Abstract The discovery of connections between the distribution of energy levels of
heavy nuclei and spacings between prime numbers has been one of the most surprising
and fruitful observations in the twentieth century. The connection between
the two areas was first observed through Montgomery’s work on the pair correlation
of zeros of the Riemann zeta function. As its generalizations and consequences
have motivated much of the subsequent work, and as it remains to this day one of the
most important outstanding conjectures in the field, it occupies a central role in our
discussion below. We describe some of the many techniques and results from the
past sixty years, especially the important roles played by numerical and experimental
investigations, that led to the discovery of the connections and progress towards
understanding the behaviors. In our survey of these two areas, we describe the common
mathematics that explains the remarkable universality. We conclude with some
thoughts on what might lie ahead in the pair correlation of zeros of the zeta function,
and other similar quantities.
1 Introduction

Montgomery’s pair correlation conjecture posits that zeros of L-functions behave
similarly to energy levels of heavy nuclei. The bridge between these fields is random
matrix theory, a beautiful subject which has successfully modeled a large variety
of diverse phenomena (see [BBDS, KrSe] for a great example of how varied the
systems can be). It is impossible in a short chapter to cover all the topics and
connections; fortunately there is no need as there is an extensive literature. Our goal is
therefore to briefly describe the history of the subject and the correspondences,
concentrating on some of the main objects of interest and past successes, ending with
a brief tour through a subset of current work and a discussion of some of the open
questions in mathematics. We are deliberately brief in areas that are well known or
are extensively covered in the literature, and instead dwell at greater length on the
inspiration from and interpretation through physics (see for example §2.6), as these
parts of the story are not as well known but deserve to be (both for historical reasons
as well as the guidance they can offer).

To this end, we begin with a short introduction to random matrix theory and a
quick description of the main characters studied in this chapter. We then continue in
§2 with a detailed exposition of the historical development of random matrix theory
in nuclear physics in the 1950s and 1960s. We note the pivotal role played by the
nuclear physics experimentalists in gathering data to support the theoretical conjectures;
we will see analogues of these when we get to the work in the 1970s and 1980s
on zeros of L-functions in §3.5. One of our main purposes is in fact to highlight the
power of experimental data, be it data from a lab or a computer calculation, and
show how attempts to explain such results influence the development and direction
of subjects. We then shift emphasis to number theory in §3 and see how studies on
the class number problem led Montgomery to his famous pair correlation conjecture
for the zeros of the Riemann zeta function. This and related statistics are the focus
of the rest of the chapter; we describe what they are, what progress has been made
(theoretically and numerically), and then turn to some open questions. Most of these
open questions involve how the arithmetic of L-functions influences the behavior;
remarkably the main terms in a variety of problems are independent of the finer
properties of L-functions, and it is only in lower order terms (or, equivalently, in the
rates of convergence to the random matrix theory behavior) that the dependencies
on these properties surface. We then conclude with current questions and some
future trends.

Acknowledgements. The third named author was partially supported by NSF grant 
DMS1265673. We thank our colleagues and collaborators over the years for many 
helpful discussions on these and related topics. One of us (Miller) was fortunate to 
be a graduate student at Princeton, and had numerous opportunities then to converse 
with John Nash on a variety of mathematical topics. It was always a joy sitting next 
to him at seminars. We are grateful for his kind invitation to contribute to this work, 
and his comments on an earlier draft. 
1.1 The Early Days: Statistics and Biometrics

Though our main characters will be energy levels of nuclei and zeros of L-functions,
the story of random matrix theory begins neither with physics nor with mathematics,
but with statistics and biometrics. In 1928 John Wishart published an article titled
The Generalised Product Moment Distribution in Samples from a Normal Multivariate
Population [Wis] in Biometrika (see [Wik] for a history of the journal, which we briefly
recap). The journal was founded at the start of the century by Francis Galton, Karl
Pearson, and Walter Weldon for the study of statistics related to biometrics. In the
editors’ introduction in the first issue (see also [Wik]), they write:

It is intended that Biometrika shall serve as a means not only of collecting or publishing 
under one title biological data of a kind not systematically collected or published elsewhere 
in any other periodical, but also of spreading a knowledge of such statistical theory as may 
be requisite for their scientific treatment. 

The question of interest for Wishart was that of estimating covariance matrices.
The paper begins with a review of work to date on samples from univariate and
bivariate populations, and issues with the determination of correlation and regression
coefficients. After summarizing some of the work and formulas from Fisher,
Wishart writes:

The distribution of the correlation coefficient was deduced by direct integration from this 
result. Further, K. Pearson and V. Romanovsky, starting from this fundamental formula, 
were able to deal with the regression coefficients. Pearson, in 1925, gave the mean value 
and standard deviation of the regression coefficient, while Romanovsky and Pearson, in the 
following year, published the actual distribution. 

After talking about the new problems that arise when dealing with three or more 
variates, he continues: 

What is now asserted is that all such problems depend, in the first instance, on the
determination of a fundamental frequency distribution, which will be a generalisation of
equation (2). It will, in fact, be the simultaneous distribution in samples of the n variances
(squared standard deviations) and the product moment coefficients. It is the purpose of the
present paper to give this generalised distribution, and to calculate its moments up to the
fourth order. The case of three variates will first be considered in detail, and thereafter a
proof for the general n-fold system will be given.

In his honor the distribution of the sample covariance matrix (arising from a
sample from a multivariate normal distribution) is called the Wishart distribution.
More specifically, if we have an $n \times p$ matrix $X$ whose rows are independently drawn
from a $p$-variate mean 0 normal distribution, the Wishart distribution is the density
of the $p \times p$ matrices $X^T X$.
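
As a concrete illustration of this construction, the following minimal sketch draws one such matrix numerically. It is not taken from Wishart's paper: the helper name `sample_wishart`, the identity population covariance, the sample sizes and the seed are all assumptions chosen for illustration.

```python
import numpy as np

def sample_wishart(n, cov, rng):
    """Draw one matrix X^T X, where the n rows of X are i.i.d. N(0, cov)."""
    p = cov.shape[0]
    # X is the n x p data matrix; each row is one p-variate observation.
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    return X.T @ X  # p x p matrix

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n, p = 500, 3                   # assumed sample size and number of variates
cov = np.eye(p)                 # assumed population covariance (identity)
W = sample_wishart(n, cov, rng)

print("shape:", W.shape)
print("symmetric:", np.allclose(W, W.T))
print("smallest eigenvalue:", np.linalg.eigvalsh(W).min())
print("W / n, an estimate of the population covariance:")
print(W / n)
```

Dividing by $n$ recovers the sample covariance (for mean 0 data), which is why fluctuations of $X^T X$ govern the estimation problem Wishart was studying.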

Several items are worth noting here. First, we have an ensemble (a collection) of
matrices whose entries are drawn from a fixed distribution; in this case there are
dependencies among the entries. Second, these matrices are used to model observable
quantities of interest, in this case covariances. Finally, in his article he mentions an
earlier work of his (published in the Memoirs of the Royal Meteorological Society,
volume II, pages 29-37, 1928) which experimentally confirmed some of the results
discussed, thus showing the connections between experiment and theory which play
such a prominent role later in the story also played a key role in the founding.

It was not until almost thirty years later that random matrix theory, in the hands
and mind of Wigner, burst onto the physics scene, and it would be almost another
thirty years before the connections with number theory emerged. Before
describing these histories in detail, we end the introduction with a very quick tour
of some of the quantities and objects we’ll meet.


1.2 Cast of Characters: Nuclei and L-functions 

The two main objects we study are energy levels of heavy nuclei on the physics
side, and zeros of the Riemann zeta function (or more generally L-functions) on
the number theory side, especially Montgomery’s pair correlation conjecture and
related statistics. We give a full statement of the pair correlation conjecture, and
results towards its proof, in §3.2. Briefly, given an ordered sequence of events (such
as zeros on the critical line, eigenvalues of Hermitian matrices, energy levels of
heavy nuclei) one can look at how often a given difference between pairs of events
is observed. The remarkable conjecture is that these very different systems exhibit
similar behavior.

We begin with a review of some facts about these areas, from theories for
their behavior to how experimental observations were obtained which shed light on
the structures, and then finish the introduction with some hints at the similarities
between these two very different systems. Parts of that section, as well as much of
§2, are expanded with permission from the survey article [FM] written by two of the
authors of this chapter for the inaugural issue of the open access journal Symmetry.
The goal of that article was similar to this chapter, though there the main quantity
discussed was Wigner’s semi-circle law and not pair correlation.

Many, if not all, of the other survey articles in the subject concentrate on the
mathematics and ignore the experimental physics. When writing the survey [FM]
the authors deliberately sought a balance, with the intention of sharing and elaborating
on that vantage again in a later work to give a wider audience a more complete
description of the development of the subjects, as other approaches are already
available in the literature. We especially recommend to the reader Goldston’s excellent
survey article Notes on pair correlation of zeros and prime numbers (see [Gol]) for
an extended, detailed technical discussion; the purpose of this chapter is to complement
this and other surveys by highlighting other aspects of the story, especially
how Montgomery’s work on the pair correlation of zeros of $\zeta(s)$ connects, through
random matrix theory, a central object of study in number theory to our understanding
of the physics of heavy nuclei.
1.2.1 Atomic Theory and Nuclei

Experiments and experimental data played a crucial role in our evolving understanding
of the atom. For example, Ernest Rutherford’s gold foil experiment (performed
by Hans Geiger and Ernest Marsden) near the start of the twentieth century
demonstrated that J. J. Thomson’s plum pudding model of an atom with negatively charged
electrons embedded in a positively charged region was false, and that the atom had
a very small positively charged nucleus with the electrons far away. These experiments
involved shooting alpha particles at thin gold foils. Alpha particles are helium
atoms without the electrons and are thus positively charged. While this positive
charge was responsible for disproving the plum pudding model, such particles could
not deeply probe the positively charged nucleus due to the strong repulsion from like
charges. To make further progress into the structure of the atom in general, and the
nucleus in particular, another object was needed. A great candidate was the neutron
(discovered by Chadwick in 1932); as it did not have a net charge, the electric force
would play an immensely smaller role in its interaction with the nucleus than it did
with the alpha particles.

The earliest studies of neutron induced reactions showed that the total neutron
cross section¹ for the interaction of low-energy (electron-volt, eV) neutrons with
a nucleus is frequently much greater than the geometrical area presented by the
target nucleus to the incident neutron [FA]. It was also found that the cross section
varies rapidly as a function of the bombarding energy of the incident neutron. The
appearance of these well-defined resonances in the neutron cross section is the most
characteristic feature of low energy nuclear reactions.

In general, the low energy resonances were found to be closely spaced (spacing
< 10 eV in heavy nuclei), and to be very narrow (widths < 0.1 eV). These facts
led Niels Bohr to introduce the compound nucleus model [Bo1] that assumes the
interaction between an incoming neutron and the target nucleus is so strong that the
neutron rapidly shares its energy with many of the target nucleons. The nuclear state
that results from the combination of incident neutron and target nucleus may therefore
last until sufficient energy again resides in one of the nucleons for it to escape
from the system. This is a statistical process, and a considerable time may elapse
before it occurs. The long lifetime of the state ($\tau$) (on a nuclear timescale) explains the
narrow width ($\Gamma$) of the resonance.² Also, since many nucleons are involved in the
formation of a compound state, the close spacing of the resonances is to be expected
since there are clearly many ways of exciting many nucleons. The qualitative model
outlined above has formed the basis of most theoretical descriptions of low-energy,
resonant nuclear reactions [BW].

¹ A total neutron cross section is defined as

(Number of events of all types per unit time per nucleus) /
(Number of incident neutrons per unit time per unit area),

and has the dimensions of area (the standard unit is the barn, $10^{-24}\,\mathrm{cm}^2$).

² The width, $\Gamma$, is related to the lifetime, $\tau$, by the uncertainty relation $\Gamma = h/2\pi\tau$, where $h$ is
Planck’s constant. The finite width (lack of energy definition) is due to the fact that a resonant
state can decay by emitting a particle, or radiation, whereas a state of definite energy must be a
stationary state.
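
To get a feel for the scales involved, the relation between width and lifetime can be evaluated for a width of 0.1 eV, the upper end quoted above for narrow resonances. This is only an order-of-magnitude sketch: the function name is hypothetical, and the nuclear transit time of about $10^{-22}$ s used for comparison is an assumed round figure, not a value from the text.

```python
# Reduced Planck constant hbar = h / (2 pi), in eV * s.
HBAR_EV_S = 6.582119569e-16

def lifetime_from_width(width_ev):
    """Lifetime tau (seconds) of a state of width Gamma (eV), using Gamma = hbar / tau."""
    return HBAR_EV_S / width_ev

# Width of 0.1 eV: the upper end quoted for narrow low-energy resonances.
tau = lifetime_from_width(0.1)

# Assumed order-of-magnitude time for a nucleon to cross a nucleus (illustrative only).
NUCLEAR_TRANSIT_S = 1e-22

print(f"lifetime: {tau:.3e} s")
print(f"ratio to assumed transit time: {tau / NUCLEAR_TRANSIT_S:.1e}")
```

Under these assumptions the compound state survives for tens of millions of transit times, which is the sense in which its lifetime is long on a nuclear timescale.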

If a resonant state can decay in a number of different ways (or channels), we
can ascribe a probability per unit time for the decay into a channel, c, which can be
expressed as a partial width $\Gamma_{\lambda c}$. The total width is the sum of the partial widths, i.e.,
$\Gamma_\lambda = \sum_c \Gamma_{\lambda c}$.

The appearance of well-defined resonances occurs in heavy nuclei (mass number
A > 100, say) for incident neutron energies up to about 100 keV, and in light
nuclei up to neutron energies of several MeV. As the neutron bombarding energies
are increased above these energies, the total cross sections are observed to become
smoother functions of neutron energy [HS]. This is due to two effects: firstly,
the level density (i.e., the number of resonances per unit energy interval) increases
rapidly as the excitation energy of the compound nucleus is increased, and secondly,
the widths of the individual resonances tend to increase with increasing excitation
energy so that, eventually, they overlap. The smoothed-out cross sections provide
useful information on the average properties of resonances. One of the most significant
features of these cross sections is the appearance of gross fluctuations that
have been interpreted in terms of the single-particle nature of the neutron-nucleus
interaction [LTW]. These giant resonances form one of the main sources of experimental
evidence for introducing the successful optical model of nuclear reactions.
This model represents the interaction between a neutron and a nucleus in terms of 
the neutron moving in a complex potential well MOBJI in which the imaginary part 
allows for the absorption of the incident neutron. 

Experimental results show that, on increasing the bombarding energy above
about 5 MeV, a different reaction mechanism may occur. For example, the energy
spectra of emitted nucleons frequently contain too many high-energy nucleons
compared with the predictions of the compound nucleus model. The mechanism no
longer appears to be one in which the incident neutron shares its energy with many
target nucleons but is one in which the neutron interacts with a single nucleon or,
at most, a few nucleons. Such a mechanism is termed a direct interaction, which is
defined as a nuclear reaction in which only a few of the available degrees of freedom
of the system are involved [AS].

The optical model, mentioned above, is an important example of a direct interaction
that takes place even at low bombarding energies. The incident neutron is
considered to move in the mean nuclear potential of all the nucleons in the target. 
This model also has been used to account for anomalies in the spectra of gamma- 
rays resulting from thermal neutron capture in ED. 

At even higher bombarding energies, greater than 50 MeV, say, the mechanism
becomes clearer in the sense that direct processes are the most important. The reactions
then give information on the fundamental nucleon-nucleon interaction; these
studies and their interpretation are, however, outside the scope of the present
discussion.

When a low-energy neutron (energy < 10 keV, say) interacts with a nucleus the
excitation energy of the compound nucleus is greatly increased by the neutron binding
energy that typically ranges from 5 to 10 MeV. In the late 1950s, experimental
methods were developed for measuring low-energy neutrons with resolutions of a
few electron-volts. This meant that, for the first time in any physical system, it
became possible to study the fine structure of resonances at energies far above the
ground state of the system. The relevant experimental methods are discussed in §2.
Important information was thereby obtained concerning the properties that characterize
the resonances such as their peak cross sections, elastic scattering widths,
and adjacent spacing. The results were used to test the predictions of various nuclear
models used to describe the interactions. These models ranged from the Fermi Gas
Model, a quantized version of classical Statistical Mechanics and Thermodynamics
[Be], to the sophisticated Nuclear Shell Model [BW]. In the mid-1950s, all Statistical
Mechanics Models predicted that the spacing distribution of nearest-neighbor
resonances of the same spin and parity in a heavy nucleus (mass number A > 100,
say) was an exponential distribution. By 1956, the experimental evidence on the
spacing distribution of s-wave resonances in a number of heavy nuclei indicated a
lack of very closely-spaced resonances, contradicting the predictions of an exponential
distribution. By 1960, two research groups [RDRH, FLM] showed,
unequivocally, that the spacing distribution of resonances up to an energy of almost
2 keV followed the prediction of the random matrix model surmised by Wigner in
1956 [Wig5]; in his model the probability of a zero spacing is zero! It is a model
rooted in statistics, which interestingly is where our story on random matrix theory
began!
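
The contrast between the two predictions can be seen in a small simulation. The sketch below is illustrative only and rests on stated assumptions: it uses the standard closed form of the Wigner surmise for mean spacing 1, $P(s) = \frac{\pi}{2} s\, e^{-\pi s^2/4}$ (a formula not written out in this section), compares it with the exponential density $e^{-s}$, and samples the eigenvalue gap of $2 \times 2$ real symmetric Gaussian matrices, for which the surmise is exact. All function names are hypothetical.

```python
import numpy as np

def goe_2x2_spacings(num_samples, rng):
    """Eigenvalue gaps of 2x2 real symmetric Gaussian matrices, rescaled to mean 1."""
    a = rng.standard_normal(num_samples)                  # diagonal entries, variance 1
    d = rng.standard_normal(num_samples)
    b = rng.standard_normal(num_samples) / np.sqrt(2.0)   # off-diagonal, variance 1/2
    # The eigenvalues of [[a, b], [b, d]] differ by sqrt((a - d)^2 + 4 b^2).
    gaps = np.sqrt((a - d) ** 2 + 4.0 * b ** 2)
    return gaps / gaps.mean()

def wigner_cdf(s):
    """P(spacing < s) under the Wigner surmise with mean spacing 1."""
    return 1.0 - np.exp(-np.pi * s ** 2 / 4.0)

def poisson_cdf(s):
    """P(spacing < s) for an exponential spacing distribution with mean 1."""
    return 1.0 - np.exp(-s)

rng = np.random.default_rng(1)
spacings = goe_2x2_spacings(200_000, rng)

for s0 in (0.1, 0.5, 1.0):
    empirical = np.mean(spacings < s0)
    print(f"s < {s0}: simulated {empirical:.4f}, "
          f"Wigner {wigner_cdf(s0):.4f}, exponential {poisson_cdf(s0):.4f}")
```

The first row is the one relevant to the experiments described above: very small spacings are roughly ten times rarer under the surmise than under the exponential law, which is the observed lack of closely spaced resonances.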


1.2.2 L-functions and Their Zeros 

There are many excellent introductions, at a variety of levels, to number theory and
L-functions. We assume the reader is familiar with the basics of the subject; for
more details see among others [Da, Ed, HW, IK, MT-B, Se]. The discussion below
is a quick review and is an abridgement (and slight expansion) of [FM], which has
additional details.

The primes are the building blocks of number theory: every positive integer can be
written uniquely (up to ordering) as a product of prime powers. Note that the role
played by the primes mirrors that of atoms in building up molecules. One of the most
important questions we can ask about primes is also one of the most basic: how many
primes are there at most $x$? In other words, how many building blocks are there up
to a given point?

Euclid proved over 2000 years ago that there are infinitely many primes; so, if we
let $\pi(x)$ denote the number of primes at most $x$, we know $\lim_{x\to\infty} \pi(x) = \infty$. Though
Euclid’s proof is still used in courses around the world (and gives a growth rate on
the order of $\log\log x$), one can obtain much better counts on $\pi(x)$.

The prime number theorem states that the number of primes at most $x$ is
$\mathrm{Li}(x) + o(\mathrm{Li}(x))$, where $\mathrm{Li}(x) = \int_2^x dt/\log t$ and for $x$ large, $\mathrm{Li}(x)$ is approximately
$x/\log x$, and $f(x) = o(g(x))$ means $\lim_{x\to\infty} f(x)/g(x) = 0$. While it is possible to
prove the prime number theorem elementarily [Erd, Sel2], the most informative
proofs use complex numbers and complex analysis, and lead to the fascinating
connection between number theory and nuclear physics. One of the most fruitful
approaches to understanding the primes is to understand properties of the Riemann
zeta function, which is defined for Re(s) > 1 by


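$$\zeta(s) \;=\; \sum_{n=1}^{\infty} \frac{1}{n^s}; \tag{1.1}$$

the series converges for Re(s) > 1 by the integral test. By unique factorization, we
may also write $\zeta(s)$ as a product over primes. To see this, use the geometric series
formula to expand $(1 - p^{-s})^{-1}$ as $\sum_{k=0}^{\infty} p^{-ks}$ and note that $n^{-s}$ occurs exactly once
on each side (and clearly every term from expanding the product is of the form $n^{-s}$
for some $n$). This is called the Euler product of $\zeta(s)$, and is one of its most important
properties:

$$\zeta(s) \;=\; \sum_{n=1}^{\infty} \frac{1}{n^s} \;=\; \prod_{p\ \mathrm{prime}} \left(1 - \frac{1}{p^s}\right)^{-1}. \tag{1.2}$$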


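Initially defined only for Re(s) > 1, using complex analysis the Riemann zeta function
can be meromorphically continued to all of $\mathbb{C}$, having only a simple pole with
residue 1 at s = 1. It satisfies the functional equation

$$\xi(s) \;=\; \frac{1}{2}\, s(s-1)\, \Gamma\!\left(\frac{s}{2}\right) \pi^{-s/2}\, \zeta(s) \;=\; \xi(1-s). \tag{1.3}$$

One proof is to use the Gamma function, $\Gamma(s) = \int_0^\infty e^{-t} t^{s-1}\,dt$. A simple change of
variables gives

$$\int_0^\infty x^{\frac{s}{2}-1} e^{-n^2 \pi x}\,dx \;=\; \frac{\Gamma(s/2)}{n^s \pi^{s/2}}. \tag{1.4}$$

Summing over $n$ represents a multiple of $\zeta(s)$ as an integral. After some algebra we
find

$$\pi^{-s/2}\,\Gamma\!\left(\frac{s}{2}\right) \zeta(s) \;=\; \int_1^\infty x^{\frac{s}{2}-1}\,\omega(x)\,dx \;+\; \int_1^\infty x^{-\frac{s}{2}-1}\,\omega\!\left(\frac{1}{x}\right) dx, \tag{1.5}$$

with $\omega(x) = \sum_{n=1}^{\infty} e^{-n^2 \pi x}$. Using Poisson summation, we see

$$\omega\!\left(\frac{1}{x}\right) \;=\; -\frac{1}{2} + \frac{1}{2}\,x^{1/2} + x^{1/2}\,\omega(x), \tag{1.6}$$

which yields

$$\pi^{-s/2}\,\Gamma\!\left(\frac{s}{2}\right) \zeta(s) \;=\; \frac{1}{s(s-1)} + \int_1^\infty \left(x^{\frac{s}{2}-1} + x^{-\frac{s}{2}-\frac{1}{2}}\right) \omega(x)\,dx, \tag{1.7}$$

from which the claimed functional equation follows.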
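
The one step in this argument that is not a routine calculation is the Poisson summation identity $\omega(1/x) = -\frac{1}{2} + \frac{1}{2}x^{1/2} + x^{1/2}\omega(x)$ for $\omega(x) = \sum_{n \ge 1} e^{-n^2\pi x}$. The following sketch checks it numerically at a few points; the function names, the truncation at 200 terms and the sample points are arbitrary choices made for illustration.

```python
import math

def omega(x, terms=200):
    """Partial sum of omega(x) = sum_{n >= 1} exp(-pi n^2 x); the tail is negligible here."""
    return sum(math.exp(-math.pi * n * n * x) for n in range(1, terms + 1))

def poisson_gap(x):
    """Difference between the two sides of omega(1/x) = -1/2 + sqrt(x)/2 + sqrt(x) omega(x)."""
    lhs = omega(1.0 / x)
    rhs = -0.5 + 0.5 * math.sqrt(x) + math.sqrt(x) * omega(x)
    return lhs - rhs

for x in (0.5, 1.0, 2.0, 3.7):
    print(f"x = {x}: omega(1/x) = {omega(1.0 / x):.12f}, gap = {poisson_gap(x):.2e}")
```

The gaps are at the level of floating point rounding, consistent with the identity. Substituting it into the second integral is what makes the right-hand side symmetric under $s \mapsto 1 - s$.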

The distribution of the primes is a difficult problem; however, the distribution of
the positive integers is not and has been completely known for quite some time! The
hope is that we can understand $\sum_n 1/n^s$, as this involves sums over the integers, and
somehow pass this knowledge on to the primes through the Euler product.
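
The equality of the sum over integers and the product over primes can be observed numerically. The sketch below, with hypothetical helper names and arbitrary cutoffs, compares a truncated sum and a truncated Euler product at $s = 2$ with the classical value $\zeta(2) = \pi^2/6$.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]

def zeta_partial_sum(s, terms):
    """Sum of 1 / n^s over 1 <= n <= terms."""
    return sum(n ** (-s) for n in range(1, terms + 1))

def zeta_partial_product(s, prime_limit):
    """Product of (1 - p^(-s))^(-1) over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product /= 1.0 - p ** (-s)
    return product

s = 2.0
cutoff = 100_000  # arbitrary truncation point for both expressions
print("partial sum:    ", zeta_partial_sum(s, cutoff))
print("partial product:", zeta_partial_product(s, cutoff))
print("pi^2 / 6:       ", math.pi ** 2 / 6)
```

Both truncations approach the same limit; only the product makes the dependence on the primes explicit, which is why it is the natural starting point for transferring information from the integers to the primes.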





Riemann [Ri] (see [Cl, Ed] for an English translation) observed a fascinating
connection between the zeros of $\zeta(s)$ and the error term in the prime number theorem.
As this relation is the starting point for our story on the number theory side, we
describe the details at some length. One of the most natural things to do to a complex
function is to take contour integrals of its logarithmic derivative; this yields information
about zeros and poles, and we will see later in (1.17) that we can get even more
information if we weight the integral with a test function. There are two expressions
for $\zeta(s)$; however, for the logarithmic derivative it is clear that we should use the
Euler product over the sum expansion, as the logarithm of a product is the sum of
the logarithms. Let


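$$\Lambda(n) \;=\; \begin{cases} \log p & \text{if } n = p^r \text{ for some integer } r \ge 1 \\ 0 & \text{otherwise.} \end{cases} \tag{1.8}$$

We find

$$\frac{\zeta'(s)}{\zeta(s)} \;=\; -\sum_{p} \frac{\log p \cdot p^{-s}}{1 - p^{-s}} \;=\; -\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} \tag{1.9}$$

(this is proved by using the geometric series formula to write $(1 - p^{-s})^{-1}$ as
$\sum_{k=0}^{\infty} 1/p^{ks}$, collecting terms and then using the definition of $\Lambda(n)$). Moving the negative
sign over and multiplying by $x^s/s$, we find

$$-\frac{1}{2\pi i} \int_{(c)} \frac{\zeta'(s)}{\zeta(s)}\, \frac{x^s}{s}\, ds \;=\; \sum_{n=1}^{\infty} \Lambda(n)\, \frac{1}{2\pi i} \int_{(c)} \left(\frac{x}{n}\right)^{s} \frac{ds}{s}, \tag{1.10}$$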


where we are integrating over some line Re(s) = c > 1. The integral on the right
hand side is 1 if $n < x$ and 0 if $n > x$ (by choosing $x$ non-integral, we do not need
to worry about $x = n$), and thus gives $\sum_{n \le x} \Lambda(n)$. By shifting contours and keeping
track of the poles and zeros of $\zeta(s)$, the residue theorem implies that the left hand
side is

$$x \;-\; \sum_{\rho:\ \zeta(\rho) = 0} \frac{x^{\rho}}{\rho}; \tag{1.11}$$

the $x$ term comes from the pole of $\zeta(s)$ at s = 1 (remember we count poles with a
minus sign), while the $x^{\rho}/\rho$ term arises from zeros; in both cases we must multiply
by the residue, which is $x^{\rho}/\rho$ (it can be shown that $\zeta(s)$ has neither a zero nor a
pole at s = 0; the factor $1/s$ does contribute the bounded constant $-\zeta'(0)/\zeta(0) = -\log 2\pi$
from its pole at s = 0, which is not displayed here). Some care is required with this
sum, as $\sum_{\rho} 1/|\rho|$ diverges. The solution involves pairing the contribution from $\rho$
with $\bar{\rho}$; see for example [Da].

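The Riemann zeta function vanishes whenever $\rho$ is a negative even integer; we
call these the trivial zeros. These terms contribute $\sum_{k=1}^{\infty} x^{-2k}/(2k) = -\frac{1}{2}\log(1 - x^{-2})$.
This leads to the following beautiful formula, known as the explicit formula:

$$x \;-\; \sum_{\substack{\rho:\ \mathrm{Re}(\rho) \in (0,1) \\ \zeta(\rho) = 0}} \frac{x^{\rho}}{\rho} \;-\; \frac{1}{2} \log\!\left(1 - x^{-2}\right) \;=\; \sum_{n \le x} \Lambda(n). \tag{1.12}$$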
If we write $n$ as $p^r$, the contribution from all $p^r$ pieces with $r \ge 2$ is bounded by
$2x^{1/2}\log x$ for $x$ large, thus we really have a formula for the sum of the primes at
most $x$, with the prime $p$ weighted by $\log p$. Through partial summation, knowing
the weighted sum is equivalent to knowing the unweighted sum.
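
The size of the weighted sum and of the prime-power contribution can be checked directly. The sketch below computes $\psi(x) = \sum_{n \le x} \Lambda(n)$ together with the primes-only sum $\theta(x) = \sum_{p \le x} \log p$; the names `psi`, `theta` and `chebyshev_sums` follow common usage but are not notation from this chapter, and the cutoff $x = 10^5$ is arbitrary.

```python
import math

def chebyshev_sums(x):
    """Return (psi, theta): the sums of log p over prime powers p^r <= x and over primes p <= x."""
    is_composite = bytearray(x + 1)
    psi = 0.0
    theta = 0.0
    for p in range(2, x + 1):
        if is_composite[p]:
            continue
        # p is prime: cross off its multiples, then add log p once per power of p up to x.
        for m in range(p * p, x + 1, p):
            is_composite[m] = 1
        log_p = math.log(p)
        theta += log_p
        q = p
        while q <= x:
            psi += log_p
            q *= p
    return psi, theta

x = 100_000
psi, theta = chebyshev_sums(x)
print("psi(x):                     ", psi)
print("theta(x), primes only:      ", theta)
print("prime powers with r >= 2:   ", psi - theta)
print("bound 2 * sqrt(x) * log(x): ", 2 * math.sqrt(x) * math.log(x))
```

At this cutoff the weighted sum is close to $x$, and the higher prime powers contribute far less than the stated bound, so the explicit formula is effectively a statement about the primes themselves.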

We can now see the connection between the zeros of the Riemann zeta function
and counting primes at most $x$. The contribution from the trivial zeros is
well-understood, and is just $-\frac{1}{2}\log(1 - x^{-2})$. The remaining zeros, whose real parts are
in [0,1], are called the non-trivial or critical zeros. They are far more important and
more mysterious. The smaller the real part of these zeros of $\zeta(s)$, the smaller the
error. Due to the functional equation, however, if $\zeta(\rho) = 0$ for a critical zero $\rho$ then
$\zeta(1 - \rho) = 0$ as well. Thus the ‘smallest’ the largest real part can be is 1/2. The
assertion that every critical zero has real part exactly 1/2 is the celebrated Riemann
Hypothesis (RH), which is probably the most important mathematical aside ever in
a paper. Riemann [Cl, Ed, Ri] wrote (translated into English; note when he talks
about the roots being real, he’s writing the roots as $1/2 + i\gamma$, and thus $\gamma \in \mathbb{R}$ is the
Riemann Hypothesis):

One now finds indeed approximately this number of real roots within these limits, and it is 
very probable that all roots are real. Certainly one would wish for a stricter proof here; I 
have meanwhile temporarily put aside the search for this after some fleeting futile attempts, 
as it appears unnecessary for the next objective of my investigation. 


Though not mentioned in the paper, Riemann had developed a terrific formula for
computing the zeros of $\zeta(s)$, and had checked (but never reported!) that the first
few were on the critical line Re(s) = 1/2. His numerical computations were only
discovered decades later when Siegel was looking through Riemann’s papers.

RH has a plethora of applications throughout number theory and mathematics;
counting primes is but one of many. The prime number theorem is in fact equivalent
to the statement that Re($\rho$) < 1 for any zero of $\zeta(s)$, and was first proved independently
by Hadamard [Had] and de la Vallée Poussin [dlVP] in 1896. Each proof
crucially used results from complex analysis, which is hardly surprising given that
Riemann had shown $\pi(x)$ is related to the zeros of the meromorphic function $\zeta(s)$.
It was not until about 50 years later that Erdős [Erd] and Selberg [Sel2] obtained
elementary proofs of the prime number theorem (in other words, proofs that did
not use complex analysis, which was quite surprising as the prime number theorem
was known to be equivalent to a statement about zeros of a meromorphic function).
See [Gol4] for some commentary on the history of elementary proofs. It is clear,
however, that the distribution of the zeros of the Riemann zeta function will be of
primary (in both senses of the word!) importance.

The Riemann zeta function is the first of many similar functions that we can
study. We assume the reader has seen L-functions before; in addition to the surveys
mentioned earlier, see also the introductory remarks in [ILS, RS]. We can examine,
for the real part of $s$ sufficiently large,

$$L(s,f) \;=\; \sum_{n=1}^{\infty} \frac{a_f(n)}{n^s}; \tag{1.13}$$

of course, while we can create such a function for any sequence $\{a_f(n)\}$ of sufficient
decay, only certain choices will lead to useful objects whose zeros encode
the solution to questions of arithmetic interest. For example, if we choose $a_f$ arising
from Dirichlet characters we obtain information about primes in arithmetic progressions,
while taking $a_f(p)$ to count the number of solutions to an elliptic curve
$y^2 = x^3 + Ax + B$ modulo $p$ yields information about the rank of the group of rational
solutions.

Our previous analysis, where many of our formulas are due to taking the logarithmic
derivative and computing a contour integral, suggests that we insist that an
Euler product hold:

$$L(s,f) \;=\; \sum_{n=1}^{\infty} \frac{a_f(n)}{n^s} \;=\; \prod_{p\ \mathrm{prime}} L_p(s,f). \tag{1.14}$$
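
For a concrete instance of a Dirichlet series with an Euler product, take $a_f(n) = \chi(n)$ with $\chi$ the non-principal Dirichlet character modulo 4, for which the local factor at $p$ is $(1 - \chi(p)p^{-s})^{-1}$. This example is not worked in the text; the sketch below, with hypothetical function names and arbitrary cutoffs, compares both expressions at $s = 2$ with Catalan's constant, the known value of this L-function there.

```python
def chi4(n):
    """Non-principal Dirichlet character modulo 4: 1, 0, -1, 0 on n = 1, 2, 3, 0 (mod 4)."""
    return (0, 1, 0, -1)[n % 4]

def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes <= limit (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, limit + 1, p):
                sieve[m] = False
    return [n for n, flag in enumerate(sieve) if flag]

def l_series(s, terms):
    """Truncated Dirichlet series: sum of chi4(n) / n^s over n <= terms."""
    return sum(chi4(n) / n ** s for n in range(1, terms + 1))

def l_euler_product(s, prime_limit):
    """Truncated Euler product: product of (1 - chi4(p) p^(-s))^(-1) over p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product /= 1.0 - chi4(p) / p ** s
    return product

CATALAN = 0.915965594177219  # Catalan's constant, the value of this L-function at s = 2

print("Dirichlet series:", l_series(2.0, 100_000))
print("Euler product:   ", l_euler_product(2.0, 100_000))
print("Catalan constant:", CATALAN)
```

Because $\chi(p)$ separates primes congruent to 1 and to 3 modulo 4, the product form is what ties this function to primes in arithmetic progressions.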

Further, we want a functional equation relating the values of the completed
L-function at $s$ and $1 - s$, which allows us to take the series expansion that originally
converges only for real part of $s$ large and obtain a function defined everywhere:

A{sJ) = Uis,f)LisJ) = (1.15) 


where $\epsilon_f$, the sign of the functional equation, is of absolute value 1, and


(■*)/) 


i^oe(5,/) 


j=l 


(1.16) 


with A 7 ^ 0 a complex number, Q > 0, (Xf-j > 0 and af-j{py = af{p''). For 
‘nice’ L-functions, it is believed that the Generalized Riemann Hypothesis (GRH)
holds: all non-trivial zeros have real part equal to 1/2.

We end our introduction to our main number theoretic objects of interest by noting
that (1.12) is capable of massive generalization, not just to other L-functions but
we can multiply (1.9) by a nice test function $\phi(s)$ instead of the specific function
$x^s/s$. The result of this choice is to have a formula that relates sums of $\phi$ at zeros
of our L-function to sums of the Fourier transform of $\phi$ at the primes. For example
(see Section 4 of [ILS]) one can show



2E L Mp'')^ 



logf" 

/7''/2logR’ 


(1.17) 


where $R$ is a free scaling parameter chosen for the problem of interest,
$A = 2\hat{\phi}(0)\log Q + \sum_{j=1}^{n} A_j$ with




(1.18) 
and the Fourier transform is defined by

$$\hat{\phi}(y) \;:=\; \int_{-\infty}^{\infty} \phi(x)\, e^{-2\pi i x y}\, dx. \tag{1.19}$$


1.2.3 From the Hilbert-Polya Connection to Random Matrix Theory 


As stated earlier, the Generalized Riemann Hypothesis asserts that the non-trivial
zeros of an L-function are of the form $\rho = 1/2 + i\gamma_\rho$ with $\gamma_\rho$ real. Thus it makes
sense to talk about the distribution of spacings between adjacent zeros. Around 1913,
Pólya conjectured that the $\gamma_\rho$ are the eigenvalues of a naturally occurring, unbounded,
self-adjoint operator, and are therefore real.³ Later, Hilbert contributed to the
conjecture, and reportedly introduced the phrase ‘spectrum’ to describe the eigenvalues
of an equivalent Hermitian operator, apparently by analogy with the optical spectra
observed in atoms. This remarkable analogy pre-dated Heisenberg’s Matrix Mechanics
and the Hamiltonian formulation of Quantum Mechanics by more than a
decade.
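
The fact underlying this conjecture, that a Hermitian matrix has real eigenvalues which can therefore be ordered and have well-defined spacings, is easy to observe numerically. The sketch below is a finite-dimensional illustration only; the matrix size, the seed and the variable names are arbitrary, and a random Hermitian matrix here stands in for the conjectural operator.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6  # arbitrary matrix size

# Build a random Hermitian matrix H = (M + M*) / 2 from a complex Gaussian matrix M.
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (M + M.conj().T) / 2

# Use the general eigenvalue solver, which does not assume H is Hermitian,
# so the vanishing imaginary parts are observed rather than imposed.
eigs = np.linalg.eigvals(H)
print("largest |imaginary part|:", np.max(np.abs(eigs.imag)))

# Real eigenvalues can be sorted, so adjacent spacings make sense.
ordered = np.sort(eigs.real)
print("ordered eigenvalues:", ordered)
print("adjacent spacings:  ", np.diff(ordered))
```

The imaginary parts vanish up to rounding error, and the sorted list agrees with the output of a solver specialized to Hermitian input such as `np.linalg.eigvalsh`.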

Not surprisingly, the Hilbert-Pólya conjecture was considered so intractable that
it was not pursued for decades, and random matrix theory remained in a dormant
state. To quote Diaconis [Di1]:

Historically, random matrix theory was started by statisticians studying the correlations
between different features of population (height, weight, income...). This led to
correlation matrices with $(i,j)$ entry the correlation between the $i$th and $j$th features. If the
data were based on a random sample from a larger population, these correlation matrices
are random; the study of how the eigenvalues of such samples fluctuate was one of the first
great accomplishments of random matrix theory.


Diaconis [Di2] has given an extensive review of random matrix theory from the
perspective of a statistician. A strong argument can be made, however, that random
matrix theory, as we know it today in the physical sciences, began in a formal mathematical
sense with the Wigner surmise [Wig5] concerning the spacing distribution
of adjacent resonances (of the same spin and parity) in the interactions between
low-energy neutrons and nuclei, which we describe in great detail in §2.


2 The ‘Birth’ of Random Matrix Theory in Nuclear Physics 


Below we discuss some of the history of investigations of the nucleus, concentrating
on the parts that led to the introduction of random matrix theory to the subject. As
mentioned earlier, this section is expanded with permission from [FM]. Our goal is
to provide the reader with both sides of the coin, highlighting the interplay between
theory and experiment, and building the basis for applications to understanding zeros
of L-functions; we have chosen to spend a good amount of space on these experiments
and conjectures as these are less well known to the general mathematician
than the later parts of our story.

^ If $v$ is an eigenvector with eigenvalue $\lambda$ of a Hermitian matrix $A$ (so $A = A^*$ with $A^*$ the complex
conjugate transpose of $A$), then $v^*(Av) = v^*(A^*v) = (Av)^*v$; the first expression is $\lambda\|v\|^2$ while the
last is $\overline{\lambda}\|v\|^2$, with $\|v\|^2 = v^*v = \sum_i |v_i|^2$ non-zero. Thus $\lambda = \overline{\lambda}$, and the eigenvalues are real. This is
one of the most important properties of Hermitian matrices, as it allows us to order the eigenvalues.

While other methods have since been developed, random matrix theory was the 
first to make truly accurate, testable predictions. The general idea is that the behavior
of zeros of L-functions is well modeled by the behavior of eigenvalues of certain
matrices. This idea had previously been successfully used to model the distribution 
of energy levels of heavy nuclei (some of the fundamental papers and books on 
the subject, ranging from experiments to theory, include [BFFMPW, DLL, Dy1, Dy2,
FLM, FRG, FoR, FKPT, Gau, HH, HPB, Hu, Meh1, Meh2, MG, MT-B, P^, Wig1, Wig2,
Wig3, Wig4, Wig5, Wig6]). We describe the development of random
matrix theory in nuclear physics below, and then delve into more of the details of 
the connection between the two subjects. 


2.1 Neutron Physics 

The period from the mid-1930s to the late 1970s was the golden age of neutron
physics; widespread interest in understanding the physics of the nucleus, coupled
with the need for accurate data in the design of nuclear reactors, made the field
of neutron physics of global importance in fundamental physics, technology, economics,
and politics. In §1.2.1 we introduced some of the early models for nuclei,
and discussed some of the original experiments. In this section we describe later
work where better resolution was possible. Later we will show how a similar perspective
and chain of progress holds in studies of zeros of the Riemann zeta function!
Thus the material here, in addition to being of interest in its own right, will
also provide a valuable vantage for study of arithmetic objects.

In the mid-1950s, a discovery was made that turned out to have far-reaching 
consequences beyond anything that those working in the field could have imagined. 
For the first time, it was possible to study the microstructure of the continuum in a 
strongly-coupled, many-body system, at very high excitation energies. This unique 
situation came about as the result of the following facts. 

• Neutrons, with kinetic energies of a few electron-volts, excite states in compound
nuclei at energies ranging from about 5 million electron-volts to almost 10 million
electron-volts - typical neutron binding energies. Schematically, see Figure 1.

• Low-energy resonant states in heavy nuclei (mass numbers greater than about
100) have lifetimes in the range $10^{-14}$ to $10^{-15}$ seconds, and therefore they have
widths of about 1 eV. The compound nucleus loses all memory of the way in
which it is formed. It takes a relatively long time for sufficient energy to reside in
a neutron before being emitted. This is a highly complex, statistical process. In
heavy nuclei, the average spacing of adjacent resonances is typically in the range
from a few eV to several hundred eV.

• Just above the neutron binding energy, the angular momentum barrier restricts
the possible range of values of total spin of a resonance, $J$ ($J = I + i + l$, where $I$
is the spin of the target nucleus, $i$ is the neutron spin, and $l$ is the relative orbital
angular momentum). This is an important technical point.

• The neutron time-of-flight method provides excellent energy resolution at energies
up to several keV. (See Firk [Fi] for a review of time-of-flight spectrometers.)

Fig. 1 An energy-level diagram showing the location of highly-excited resonances in the compound
nucleus formed by the interaction of a neutron, n, with a nucleus of mass number A. Nature
provides us with a narrow energy region in which the resonances are clearly separated, and are
observable.

The speed $v_n$ of a neutron can be determined by measuring the time $t_n$ that it takes
to travel a measured distance $\ell$ in free space. Using the standard result of special
relativity, the kinetic energy of the neutron can be deduced using the equation

$$E_n = E_0\left[(1 - v_n^2/c^2)^{-1/2} - 1\right], \tag{2.1}$$

where $E_0 \approx 939.553$ MeV is the rest energy of the neutron and $c \approx 2.997925 \cdot 10^8$
m/s is the speed of light.

If the units of energy are MeV, and those of length and time are meters and
nanoseconds, then

$$E_n = 939.553\left[(1 - 11.126496\,\ell^2/t_n^2)^{-1/2} - 1\right]\ \text{MeV}. \tag{2.2}$$

It is frequently useful to rearrange this equation to give the ratio $t_n/\ell$ for a given
energy,

$$t_n/\ell = 3.3356404\Big/\sqrt{1 - \left(939.553/(E_n + 939.553)\right)^2}. \tag{2.3}$$

Typical values for this ratio are 72.355 ns/m for $E_n = 1$ MeV and 23.044 ns/m for
$E_n = 10$ MeV.
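The conversion between flight time and energy in (2.1)-(2.3) is easy to check numerically. The following Python sketch (the function and constant names are ours, not the authors') evaluates the relativistic expressions with the constants quoted above; it should reproduce the two typical ratios to within rounding.

```python
import math

E0_MEV = 939.553          # neutron rest energy in MeV, as quoted in the text
C_M_PER_NS = 0.2997925    # speed of light in meters per nanosecond

def kinetic_energy_mev(length_m, time_ns):
    """Relativistic kinetic energy (MeV) from flight path (m) and flight time (ns)."""
    beta = (length_m / time_ns) / C_M_PER_NS
    return E0_MEV * (1.0 / math.sqrt(1.0 - beta * beta) - 1.0)

def time_per_meter_ns(energy_mev):
    """Flight time per unit length (ns/m) for a neutron of the given kinetic energy."""
    ratio = E0_MEV / (energy_mev + E0_MEV)
    return (1.0 / C_M_PER_NS) / math.sqrt(1.0 - ratio * ratio)

for energy in (1.0, 10.0):
    print(f"E_n = {energy:5.1f} MeV -> t_n/l = {time_per_meter_ns(energy):.3f} ns/m")
```

Feeding the computed flight time back into `kinetic_energy_mev` recovers the starting energy, which is a convenient consistency check on the rearrangement.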

At energies below 1 MeV, the non-relativistic approximation to (2.3) is adequate:

$$(t_n/\ell)_{\mathrm{NR}} = \sqrt{E_0/(2E_nc^2)} = 72.298/\sqrt{E_n}\ \mu\text{s/m}, \tag{2.4}$$

with $E_n$ measured in eV (equivalently, ns/m with $E_n$ in MeV). In the eV-region, it is usual
to use units of $\mu$s/m: a 1 eV neutron travels 1 meter
in 72.3 microseconds. At non-relativistic energies, the energy resolution $\Delta E$ at an
energy $E$ is simply

$$\Delta E \approx 2E\,\Delta t/t_E, \tag{2.5}$$

where $\Delta t$ is the total timing uncertainty, and $t_E$ is the flight time for a neutron of
energy $E$.

In 1958, the two highest-resolution neutron spectrometers in the world had total
timing uncertainties $\Delta t \approx 200$ nanoseconds. For a flight-path length of 50 meters
the resolution was $\Delta E \approx 3$ eV at 1 keV.
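The quoted resolution follows from (2.5) together with the non-relativistic flight time of (2.4). A minimal Python sketch, assuming a 50 meter path and a 200 ns timing uncertainty as stated above (the function names are illustrative):

```python
import math

def flight_time_us(energy_ev, path_m):
    """Non-relativistic flight time in microseconds: 72.298 / sqrt(E[eV]) us per meter."""
    return 72.298 / math.sqrt(energy_ev) * path_m

def energy_resolution_ev(energy_ev, path_m, timing_us):
    """Delta E ~ 2 E (Delta t / t_E), valid at non-relativistic energies."""
    return 2.0 * energy_ev * timing_us / flight_time_us(energy_ev, path_m)

delta_e = energy_resolution_ev(1000.0, 50.0, 0.2)
print(f"Delta E at 1 keV: {delta_e:.1f} eV")
```

The result is about 3.5 eV, consistent with the rounded figure of 3 eV given in the text. Because $t_E$ grows linearly with the path length, doubling the flight path halves $\Delta E$ for the same timing uncertainty.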

In $^{238}$U + n, the excitation energy is about 5 MeV; the effective resolution for a
1 keV-neutron was therefore

$$\Delta E/E_{\text{effective}} \approx 6\cdot 10^{-7} \tag{2.6}$$

(at 1 eV, the effective resolution was about 10^^*). 

Two basic broadening effects limit the sensitivity of the method. 

1. Doppler broadening of the resonance profile due to the thermal motion of the
target nuclei; it is characterized by the quantity $\delta \approx 0.3\sqrt{E/A}$ (eV), where $A$ is
the mass number of the target. If $E = 1$ keV and $A = 200$, $\delta \approx 0.7$ eV, a value
that may be ten times greater than the natural width of the resonance.

2. Resolution broadening of the observed profile due to the finite resolving power
of the spectrometer. For a review of the experimental methods used to measure
neutron total cross sections see Firk and Melkonian [FMe]. Lynn [Ly] has given
a detailed account of the theory of neutron resonance reactions.
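As a quick check of the Doppler estimate in the first item, the following Python lines read the width as $\delta \approx 0.3\sqrt{E/A}$ eV with $E$ in eV; this reading is our assumption about the intended formula, chosen because it reproduces the quoted 0.7 eV.

```python
import math

def doppler_width_ev(energy_ev, mass_number):
    """Doppler width estimate delta ~ 0.3 * sqrt(E / A), with E in eV and the result in eV."""
    return 0.3 * math.sqrt(energy_ev / mass_number)

print(f"delta = {doppler_width_ev(1000.0, 200):.2f} eV")
```

The width grows only as the square root of the energy, so at 1 keV it is already comparable to the best timing resolution described above.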

In the early 1950s, the field of low-energy neutron resonance spectroscopy was
dominated by research groups working at nuclear reactors. They were located at
National Laboratories in the United States, the United Kingdom, Canada, and the
former USSR. The energy spectrum of fission neutrons produced in a reactor is
moderated in a hydrogenous material to generate an enhanced flux of low-energy
neutrons. To carry out neutron time-of-flight spectroscopy, the continuous flux from
the reactor is “chopped” using a massive steel rotor with fine slits through it. At
the maximum attainable speed of rotation (about 20,000 rpm), and with slits a few
thousandths-of-an-inch in width, it is possible to produce pulses each with a duration
of approximately 1 $\mu$sec. The chopped beams have rather low fluxes, and therefore the
flight paths are limited in length to less than 50 meters. The resolution at 1 keV is
then $\Delta E \approx 20$ eV, clearly not adequate for the study of resonance spacings of about 10
eV.

In 1952, there were only four accelerator-based, low-energy neutron spectrometers
operating in the world. They were at Columbia University in New York City,
Brookhaven National Laboratory, the Atomic Energy Research Establishment, Harwell,
England, and at Yale University. The performances of these early accelerator-based
spectrometers were comparable with those achieved at the reactor-based facilities.
It was clear that the basic limitations of the neutron-chopper spectrometers
had been reached, and therefore future developments in the field would require
improvements in accelerator-based systems.

In 1956, a new high-powered injector for the electron gun of the Harwell electron
linear accelerator was installed to provide electron pulses with very short durations
(typically less than 200 nanoseconds) [FRG]. The pulsed neutron flux (generated
by the ($\gamma$, n) reaction) was sufficient to permit the use of a 56 meter flight path; an
energy resolution of 3 eV at 1 keV was achieved.

At the same time, Professors Havens and Rainwater (pioneers in the field of neutron
time-of-flight spectroscopy) and their colleagues at Columbia University were
building a new 385 MeV proton synchrocyclotron a few miles north of the campus
(at the Nevis Laboratory). The accelerator was designed to carry out experiments
in meson physics and low-energy neutron physics (neutrons generated by the (p, n)
reaction). By 1958, they had produced a pulsed proton beam with duration of 25
nanoseconds, and had built a 37 meter flight path [RDRH, DRRH]. The hydrogenous
neutron moderator generated an effective pulse width of about 200 nanoseconds
for 1 keV-neutrons. In 1960, the length of the flight path was increased to
200 meters, thereby setting a new standard in neutron time-of-flight spectroscopy
[GRPH].


2.2 The Wigner Surmise 

At a conference on Neutron Physics by Time-of-Flight, held in Gatlinburg, Tennessee
on November 1st and 2nd, 1956, Professor Eugene Wigner (Nobel Laureate
in Physics, 1963) presented his surmise regarding the theoretical form of the spacing
distribution of adjacent neutron resonances (of the same spin and parity) in heavy
nuclei. At the time, the prevailing wisdom was that the spacing distribution had a
Poisson form (see, however, EEl). The limited experimental data then available
was not sufficiently precise to fix the form of the distribution (see m i). The following
quotation, taken from Wigner’s presentation at the conference, introduces
the concept of random matrices in physics for the first time:

Perhaps I am now too courageous when I try to guess the distribution of the distances between
successive levels. I should re-emphasize that levels that have different $J$-values (total
spin) are not connected with each other. They are entirely independent. So far, experimental
data are available only on even-even elements. Theoretically, the situation is quite simple if
one attacks the problem in a simple-minded fashion. The question is simply ‘what are the
distances of the characteristic values of a symmetric matrix with random coefficients?’

We know that the chance that two such energy levels coincide is infinitely unlikely. We
consider a two-dimensional matrix, $\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$, in which case the distance between two
levels is $\sqrt{(a_{11} - a_{22})^2 + 4a_{12}^2}$. This distance can be zero only if $a_{11} = a_{22}$ and $a_{12} = 0$. The
difference between the two energy levels is the distance of a point from the origin, the two
coordinates of which are $(a_{11} - a_{22})$ and $a_{12}$. The probability that this distance is $S$ is, for
small values of $S$, always proportional to $S$ itself because the volume element of the plane
in polar coordinates contains the radius as a factor....

The probability of finding the next level at a distance $S$ now becomes proportional to $S\,dS$.
Hence the simplest assumption will give the probability

$$\tfrac{1}{2}\pi\rho^2\exp\left(-\tfrac{1}{4}\pi\rho^2S^2\right)S\,dS \tag{2.7}$$

for a spacing between $S$ and $S + dS$.

If we put $x = \rho S = S/\langle S\rangle$, where $\langle S\rangle$ is the mean spacing, then the probability distribution
takes the standard form

$$p(x)\,dx = \frac{\pi}{2}\,x\exp\left(-\pi x^2/4\right)dx, \tag{2.8}$$

where the coefficients are obtained by normalizing both the area and the mean to unity.
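Wigner's two-by-two argument can be tested by direct simulation. The sketch below is our own illustration, not part of the original presentation: it draws real symmetric $2\times 2$ matrices with Gaussian entries (diagonal variance 1 and off-diagonal variance 1/2, an assumption chosen so that the ensemble is rotation invariant), computes the eigenvalue gap from the closed-form expression in the quotation, rescales to unit mean spacing, and compares the empirical distribution function with the integral of the standard form $(\pi/2)\,x\exp(-\pi x^2/4)$.

```python
import math
import random

def sample_spacings(count, seed=0):
    """Normalized eigenvalue gaps of random real symmetric 2x2 matrices.

    Diagonal entries are N(0, 1) and the off-diagonal entry is N(0, 1/2);
    this choice of variances is an illustrative assumption.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(count):
        a11 = rng.gauss(0.0, 1.0)
        a22 = rng.gauss(0.0, 1.0)
        a12 = rng.gauss(0.0, math.sqrt(0.5))
        gaps.append(math.sqrt((a11 - a22) ** 2 + 4.0 * a12 ** 2))
    mean_gap = sum(gaps) / count
    return [g / mean_gap for g in gaps]   # rescale to unit mean spacing

def wigner_cdf(x):
    """Integral of (pi/2) t exp(-pi t^2 / 4) from 0 to x."""
    return 1.0 - math.exp(-math.pi * x * x / 4.0)

spacings = sample_spacings(200_000)
for x in (0.25, 0.5, 1.0, 2.0):
    empirical = sum(s <= x for s in spacings) / len(spacings)
    print(f"x = {x:4.2f}: simulated {empirical:.4f}, Wigner {wigner_cdf(x):.4f}")
```

With these variances the two coordinates $(a_{11} - a_{22})$ and $2a_{12}$ are independent Gaussians of equal variance, so the gap is the radius of a planar Gaussian point and the agreement is exact up to sampling noise. The scarcity of small spacings is the level repulsion that distinguishes this form from an exponential law.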

The Wigner form, in which the probability of zero spacing is zero, is strikingly
different from the Poisson form

$$p(x)\,dx = \exp(-x)\,dx, \tag{2.9}$$

in which the probability is a maximum for zero spacing. The form of the Wigner
surmise had been previously discussed by Wigner himself [Wig1], and by Landau
and Smorodinsky [LS], but not in the spirit of random matrix theory.

It is interesting to note that the Wigner distribution is a special case of a general
statistical distribution, named after Professor E. H. Waloddi Weibull (1887-1979), a
Swedish engineer and statistician [Wei]. For many years, the distribution has been
in widespread use in statistical analyses in industries such as aerospace, automotive,
electric power, nuclear power, communications, and life insurance.^ The distribution
gives the lifetimes of objects and is therefore invaluable in studies of the failure rates
of objects under stress (including people!). The Weibull probability density function
is

$$\mathrm{Wei}(x;k,\lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}\exp\left(-(x/\lambda)^k\right), \tag{2.10}$$

where $x > 0$, $k > 0$ is the shape parameter, and $\lambda > 0$ is the scale parameter. We
see that $\mathrm{Wei}(x;2,2/\sqrt{\pi}) = p(x)$, the Wigner distribution. Other important Weibull
distributions are given in the following list.

• $\mathrm{Wei}(x;1,1) = \exp(-x)$, the Poisson distribution;

• $\mathrm{Wei}(x;2,\lambda) = \mathrm{Ray}(\lambda)$, the Rayleigh distribution;

• $\mathrm{Wei}(x;3,\lambda)$ is approximately a normal distribution.^

For $\mathrm{Wei}(x;k,\lambda)$, the mean is $\lambda\Gamma(1 + (1/k))$, the median is $\lambda\log(2)^{1/k}$, and
the mode is $\lambda((k-1)/k)^{1/k}$ if $k > 1$. As $k \to \infty$, the Weibull distribution has
a sharp peak at $\lambda$. Historically, Frechet introduced this distribution in 1927, and
nuclear physicists often refer to the Weibull distribution as the Brody distribution
[BFFMPW].

^ In fact, one of the authors has used Weibull distributions to model run production in major league
baseball, giving a theoretical justification for Bill James’ Pythagorean Won-Loss formula [Mil].
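The identification of the Wigner distribution with a Weibull distribution can be verified directly. The Python sketch below assumes the standard Weibull density $(k/\lambda)(x/\lambda)^{k-1}\exp(-(x/\lambda)^k)$ and the standard moment formulas; the function names are ours.

```python
import math

def weibull_pdf(x, k, lam):
    """Standard Weibull density with shape k and scale lam, for x >= 0."""
    return (k / lam) * (x / lam) ** (k - 1) * math.exp(-((x / lam) ** k))

def wigner_pdf(x):
    """Wigner surmise normalized to unit area and unit mean."""
    return (math.pi / 2.0) * x * math.exp(-math.pi * x * x / 4.0)

def weibull_mean(k, lam):
    return lam * math.gamma(1.0 + 1.0 / k)

def weibull_variance(k, lam):
    return lam ** 2 * (math.gamma(1.0 + 2.0 / k) - math.gamma(1.0 + 1.0 / k) ** 2)

lam_wigner = 2.0 / math.sqrt(math.pi)
for x in (0.1, 0.5, 1.0, 2.5):
    print(f"x = {x}: Weibull {weibull_pdf(x, 2, lam_wigner):.6f}, Wigner {wigner_pdf(x):.6f}")
print(f"mean for k = 2, lam = 2/sqrt(pi): {weibull_mean(2, lam_wigner):.6f}")
print(f"k = 3, lam = 1: mean {weibull_mean(3, 1.0):.3f}, variance {weibull_variance(3, 1.0):.3f}")
```

The scale $2/\sqrt{\pi}$ is exactly the value that makes the mean $\lambda\Gamma(3/2)$ equal to one, which is why it appears in the Wigner case. The last line evaluates the mean and variance of the $k = 3$, $\lambda = 1$ case that is compared with a normal distribution.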

At the time of the Gatlinburg conference, no more than 20 s-wave neutron resonances
had been clearly resolved in a single compound nucleus and therefore it was
not possible to make a definitive test of the Wigner surmise. Immediately following
the conference, J. A. Harvey and D. J. Hughes [HH], and their collaborators, working
in the fast-neutron-chopper groups at the high flux reactor at the Brookhaven
National Laboratory, and at the Oak Ridge National Laboratory, gathered their own
limited data, and all the data from neutron spectroscopy groups around the world,
to obtain the first global spacing distribution of s-wave neutron resonances. Their
combined results, published in 1958, showed a distinct lack of very closely spaced
resonances, in agreement with the Wigner surmise.

By late 1959, the experimental situation had greatly improved. At Columbia
University, two students of Professors Havens and Rainwater completed their Ph.D.
theses; one, Joel Rosen [RDRH], studied the first 55 resonances in $^{238}$U + n up to
1 keV, and the other, J. Scott Desjardins [DRRH], studied resonances in two silver
isotopes (of different spin) in the same energy region. These were the first results
from the new high-resolution neutron facility at the Nevis cyclotron.

At Harwell, Firk, Lynn, and Moxon [FLM] completed their study of the first 100
resonances in $^{238}$U + n at energies up to 1.8 keV; their measurement of the total
neutron cross section for the interaction $^{238}$U + n in the energy range 400-1800 eV
is shown in Figure 2.

When this experiment began in 1956, no resonances had been resolved at energies
above 500 eV. The distribution of adjacent spacings of the first 100 resonances
in the single compound nucleus, $^{238}$U + n, ruled out an exponential distribution and
provided the best evidence (then available) in support of Wigner’s proposed distribution.

^ Obviously this Weibull cannot be a normal distribution, as they have very different decay rates
for large $x$, and this Weibull is a one-sided distribution! What we mean is that for $0 < x < 2$ this
Weibull is well approximated by a normal distribution which shares its mean and variance, which
are (respectively) $\Gamma(4/3) \approx .893$ and $\Gamma(5/3) - \Gamma(4/3)^2 \approx .105$.

Fig. 2 High resolution studies of the total neutron cross section of $^{238}$U in the energy range 400
eV - 1800 eV. The vertical scale (in units of “barns”) is a measure of the effective area of the target
nucleus.

Over the last half-century, numerous studies have not changed the basic findings.
At the present time, almost 1000 s-wave neutron resonances in the compound nucleus
$^{238}$U + n have been observed in the energy range up to 20 keV. The latest results,
with their greatly improved statistics, are shown in Figure 3 [DLL].


2.3 Some Nuclear Models 

It is interesting to note that, during the 1950s and 1960s, the study of the spacing
distribution of neutron-induced resonances was far from the main stream of research
in nuclear physics; almost all research was concerned with fundamental questions
associated with nuclear structure and not with quantum statistical mechanics. The
newly-discovered Shell Model [Ma] ILEI of nuclei, and developments such as the
Collective Model [Ra, BM] were popular, and quite rightly so, when the successes
of these models in accounting for the observed energies, spins and parities, and
magnetic moments of nuclear states, particularly in light nuclei (mass numbers <
20, say) were considered.

Fig. 3 A Wigner distribution fitted to the spacing distribution of 932 s-wave resonances in the
interaction $^{238}$U + n at energies up to 20 keV.

These models were not able to account for the spacing distributions in heavy
nuclei (mass numbers $A > 150$); the complex nature of so many strongly interacting
nucleons prevented any detailed analysis. However, the treatment of such complex
problems had been considered in the mid-1930s, before the advent of the Shell
Model. The Fermi Gas Model and other approaches based upon quantum versions
of classical statistical mechanics and thermodynamics were introduced, particularly
by Bethe [Be]. The Fermi Gas Model treats the nucleons as non-interacting spin-1/2
particles in a confined volume of nuclear size. This, of course, seems at variance
with the known strong interaction between pairs of nucleons. However, the argument
is made that the nuclear gas is completely degenerate and therefore, because of the
Pauli exclusion principle, the nucleons can be considered free! The model was the
first to predict the energy-dependence of the density of states in the nuclear system.

The number of states that are available to a freely moving particle in a volume $V$
(the nuclear volume) that has a linear momentum in the range $p$ to $p + dp$ is

$$dn = (4\pi V/h^3)\,p^2\,dp. \tag{2.11}$$

This leads to

$$n = (V/3\pi^2\hbar^3)\,p_{\max}^3, \tag{2.12}$$

where the result has been doubled because of the twofold spin degeneracy of the
nucleons. The “Fermi energy” $E_F$ corresponds to the maximum momentum:

$$E_F = p_{\max}^2/2m. \tag{2.13}$$
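Returning to the step from (2.11) to (2.12), the intermediate algebra can be spelled out as follows. This is our own filling-in of the computation, writing $p_{\max}$ for the largest occupied momentum and using $h = 2\pi\hbar$; the factor of 2 is the spin degeneracy mentioned above.

```latex
n \;=\; 2\int_0^{p_{\max}} \frac{4\pi V}{h^3}\,p^2\,dp
  \;=\; \frac{8\pi V}{3h^3}\,p_{\max}^3
  \;=\; \frac{8\pi V}{3\,(2\pi\hbar)^3}\,p_{\max}^3
  \;=\; \frac{V}{3\pi^2\hbar^3}\,p_{\max}^3 .
```

Solving for $p_{\max}$ shows that the maximum momentum, and hence the Fermi energy, depends only on the number density $n/V$ of the nucleons.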


The level density $\rho(E^*)$ at an excitation energy $E^*$ predicted by the model is

$$\rho(E^*) = \rho(0)\exp\left(2\sqrt{aE^*}\right), \tag{2.14}$$

where $a$ is given by the equation

$$E^* = a(kT)^2 \tag{2.15}$$

in which $k$ is Boltzmann’s constant and $T$ is the absolute temperature. The above
expression for the level density is for states of all spins and parities.

In practical cases, $E^*$ is about 6 MeV for low-energy neutron interactions; this
value leads to the following ratio for the mean level spacing at $E^* = 6$ MeV and at
$E^* = 0$ (the ground state):

$$\langle D(6\,\text{MeV})\rangle/\langle D(0)\rangle \approx 4\cdot 10^{-8}. \tag{2.16}$$

For $\langle D(0)\rangle = 100$ keV (a practical value), the mean level spacing at $E^* = 6$ MeV
is $\approx 4\cdot 10^{-3}$ eV, which is more than three orders-of-magnitude smaller than typical
values observed in heavy nuclei.
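The size of this suppression can be explored numerically. The sketch below takes the level density in the standard Fermi-gas form $\rho(E^*) = \rho(0)\exp(2\sqrt{aE^*})$, so that the mean spacing (the reciprocal of the level density) scales as $\exp(-2\sqrt{aE^*})$. The value $a = 12\ \mathrm{MeV}^{-1}$ is a hypothetical level-density parameter chosen by us for illustration; it is not taken from the text.

```python
import math

def spacing_ratio(a_per_mev, excitation_mev):
    """<D(E*)>/<D(0)> = rho(0)/rho(E*) = exp(-2 sqrt(a E*))."""
    return math.exp(-2.0 * math.sqrt(a_per_mev * excitation_mev))

A_PER_MEV = 12.0          # hypothetical level-density parameter (illustration only)
D0_EV = 1.0e5             # ground-state spacing of 100 keV, as in the text

ratio = spacing_ratio(A_PER_MEV, 6.0)
print(f"spacing ratio at 6 MeV: {ratio:.2e}")
print(f"implied mean spacing:   {ratio * D0_EV:.2e} eV")
```

Under this assumption the ratio is a few times $10^{-8}$ and the implied spacing is a few thousandths of an eV, far below the spacings of a few eV to several hundred eV observed in heavy nuclei. The exponential sensitivity to $a$ means that modest changes in the assumed parameter shift the prediction by orders of magnitude.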

Many refinements of the model were introduced over the years; the models take
into account spin, parity, and nucleon pairing effects. A frequently used refined form
is

$$\rho(E^*,J) = \rho(E^*,0)(2J+1)\exp\left(-J(J+1)/2\sigma^2\right), \tag{2.17}$$

where $\sigma$ is called the “spin-cut-off parameter”; the value of $2\sigma^2$ is typically about 10.
The predicted spacing distributions for two values of $\sigma$, and their comparison with
a Wigner and an exponential distribution, are shown in Figure 4.


2.4 The Optical Model 

In 1936, Ostrofsky et al. [OBJ] introduced a model of nuclear reactions that employed
a complex nuclear potential to account for absorption of the incoming nucleon.
Later, Feshbach, Porter and Weisskopf [FPW] introduced an important development
of the model that helped further our understanding of the average properties
of parameters used to describe nuclear reactions at low energies.

The following discussion provides insight into the physical content of their
model. Consider the plane-wave solutions of the Schrodinger equation:

$$\frac{d^2\phi}{dx^2} + (2m/\hbar^2)[E + V_0 + iW]\phi = 0, \qquad \phi = \exp(\pm ikx), \tag{2.18}$$

where the + sign indicates outgoing waves and the − sign indicates incoming
waves. The wave number $k$ is complex:

$$k = \sqrt{(2m/\hbar^2)[(E + V_0) + iW]}, \tag{2.19}$$

which can be written

$$k = k_{\mathrm{R}} + ik_{\mathrm{Im}}. \tag{2.20}$$

For $W < (E + V_0)$ (a reasonable assumption) we have

$$k_{\mathrm{R}} \approx \sqrt{(2m/\hbar^2)(E + V_0)}, \qquad k_{\mathrm{Im}} \approx [W/(E + V_0)](k_{\mathrm{R}}/2). \tag{2.21}$$

Taking typical practical values $E = 10$ MeV, $V_0 = 40$ MeV and $W = 10$ MeV, the
wave numbers are $k_{\mathrm{R}} \approx 1.5$ fm$^{-1}$ and $k_{\mathrm{Im}} \approx k_{\mathrm{R}}/10 \approx 0.15$ fm$^{-1}$.

Fig. 4 The spacing distribution of adjacent levels of the same spin and parity follows a Wigner
distribution. For a completely random distribution of levels (in both spin and parity) the distribution
function is exponential. The distributions for random superpositions of several sequences (each of
which is of a Wigner form with a characteristic spin and parity) are, for level densities given by
(2.17) and $\sigma = 1$ and 3, found to approach the exponential distribution.

We see that the outgoing solution of the wave equation is

$$\phi = \exp(ik_{\mathrm{R}}x)\exp(-k_{\mathrm{Im}}x), \tag{2.22}$$

which represents an exponentially attenuated wave. The wave number $k_{\mathrm{Im}}$ is effectively
an attenuation coefficient. The “decay length” associated with the probability
function $|\phi|^2$ is the “mean free path”:

$$\lambda = 1/2k_{\mathrm{Im}} = (E + V_0)/Wk_{\mathrm{R}}. \tag{2.23}$$

Using the above values for the energies, we obtain $\lambda \approx 3.2$ fm. This value is of nuclear
dimension, and supports the underlying hypothesis of the Compound Nucleus
Model.

If the mean spacing of energy levels of a particle of mass $m$ inside the compound
nucleus is $\langle D\rangle$, and its wave number is $K$, then the particle covers a distance

$$d \approx (h/\langle D\rangle)(hK/2\pi m) = (h^2K)/(2\pi m\langle D\rangle) \tag{2.24}$$

inside the nucleus at an average speed $\langle v\rangle \approx hK/2\pi m$ before it is emitted (or before
another indistinguishable particle is emitted). At an excitation energy of 10 MeV,
a mean level spacing $\langle D\rangle \approx 40$ eV, and a mean lifetime $h/\langle D\rangle \approx 10^{-16}$ sec are
predicted. These are reasonable values, considering the crudeness of the model.
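The numerical values quoted in this section can be reproduced in a few lines. The Python sketch below is our own check, with assumed rounded constants ($\hbar c \approx 197.327$ MeV fm, a nucleon rest energy of about 939 MeV, and $h \approx 4.1357\cdot 10^{-15}$ eV s); it evaluates the real and imaginary wave numbers for a weakly absorbing potential, the mean free path $1/(2k_{\mathrm{Im}})$, and the lifetime $h/\langle D\rangle$.

```python
import math

HBAR_C_MEV_FM = 197.327     # hbar * c in MeV fm (assumed rounded value)
NUCLEON_MC2_MEV = 939.0     # approximate nucleon rest energy in MeV
H_EV_S = 4.135668e-15       # Planck constant in eV s

def wave_numbers(energy, depth, absorption):
    """Real and imaginary parts of k (fm^-1) when absorption is small compared with energy + depth."""
    k_real = math.sqrt(2.0 * NUCLEON_MC2_MEV * (energy + depth)) / HBAR_C_MEV_FM
    k_imag = (absorption / (energy + depth)) * (k_real / 2.0)
    return k_real, k_imag

k_real, k_imag = wave_numbers(10.0, 40.0, 10.0)   # E, V0, W in MeV
mean_free_path = 1.0 / (2.0 * k_imag)
lifetime = H_EV_S / 40.0                           # mean spacing of 40 eV

print(f"k_R  = {k_real:.2f} fm^-1")
print(f"k_Im = {k_imag:.3f} fm^-1")
print(f"mean free path = {mean_free_path:.1f} fm")
print(f"h/<D> = {lifetime:.1e} s")
```

The mean free path of roughly 3 fm is smaller than the radius of a heavy nucleus, which is the sense in which the estimate supports the compound-nucleus picture.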

The level density and level widths increase as the neutron bombarding energy
increases; an energy region is therefore reached in which the levels completely overlap.
Cross section measurements then provide information on the average properties
of the levels and, in particular, on the neutron strength function [LTW] defined as

5 = irinf/iD) (2.25) 

in which = is the average reduced neutron width and (D) is the average 

spacing. For i-wave neutrons, = '^kaFxn, where k is the neutron wave number, a 
is the nuclear radius, and is the neutron width of the level X. 

The average absorption cross section $\langle\sigma_{\mathrm{abs}}\rangle$ may be obtained by averaging over
the collision function $U$ [LTW]. The following expressions are then obtained:

$$1 - |\langle U\rangle|^2 = 2\pi(\langle\Gamma_{\lambda n}\rangle/\langle D\rangle),$$

$$\langle\sigma_{\mathrm{abs}}\rangle = (\pi/k^2)\,g\,(1 - |\langle U\rangle|^2), \tag{2.26}$$

where $g$ is a statistical “spin weighting factor”.

The term $1 - |\langle U\rangle|^2$ is directly related to the cross section for the formation of
a compound nucleus [FPW] which is, in turn, proportional to the strength function.
The importance of studying the spacing distribution of resonances, of a given spin
and parity, originated in recognizing that the value of $\langle D\rangle$, the average spacing,
appears as the denominator in the fundamental strength function.


2.5 Further Developments 

The first numerical investigation of the distribution of successive eigenvalues associated
with random matrices was carried out by Porter and Rosenzweig in the late
1950s [PR]. They diagonalized a large number of matrices where the elements are
generated randomly but constrained by a probability distribution. The analytical theory
developed in parallel with their work: Mehta [Meh1], Mehta and Gaudin [MG],
and Gaudin [Gau]. At the time it was clear that the spacing distribution was not
influenced significantly by the chosen form of the probability distribution. Remarkably,
the $n \times n$ distributions had forms given almost exactly by the original Wigner
$2 \times 2$ distribution.

The linear dependence of $p(x)$ on the normalized spacing $x$ (for small $x$) is a
direct consequence of the symmetries imposed on the Hamiltonian matrix, $H(h_{ij})$.
Dyson [Dy1] discussed the general mathematical properties associated with random
matrices and made fundamental contributions to the theory by showing that different
results are obtained when different symmetries are assumed for $H$. He introduced
three basic distributions; in physics, only two are important:

• the Gaussian Orthogonal Ensemble (GOE) for systems in which rotational symmetry
and time-reversal invariance hold (the Wigner distribution): $p(x) = (\pi/2)\,x\exp(-(\pi/4)x^2)$;

• the Gaussian Unitary Ensemble (GUE) for systems in which time-reversal invariance
does not hold (French et al. [FKPT]): $p(x) = (32/\pi^2)\,x^2\exp(-(4/\pi)x^2)$.

The mathematical details associated with these distributions are given in [Meh1].
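Both surmise densities are meant to have unit area and unit mean spacing, and this pins down their constants. The following Python sketch (our own check, using a simple midpoint rule) integrates the GOE form $(\pi/2)\,x\exp(-\pi x^2/4)$ and the GUE form $(32/\pi^2)\,x^2\exp(-4x^2/\pi)$; note that for the GUE prefactor $32/\pi^2$ the exponent must be $4x^2/\pi$ for the normalization to hold.

```python
import math

def goe_surmise(x):
    """Wigner surmise for the GOE: linear level repulsion at small x."""
    return (math.pi / 2.0) * x * math.exp(-math.pi * x * x / 4.0)

def gue_surmise(x):
    """Wigner surmise for the GUE: quadratic level repulsion at small x."""
    return (32.0 / math.pi ** 2) * x * x * math.exp(-4.0 * x * x / math.pi)

def integrate(f, upper=12.0, steps=120_000):
    """Composite midpoint rule on [0, upper]; both densities are negligible beyond."""
    h = upper / steps
    return h * sum(f((i + 0.5) * h) for i in range(steps))

for name, p in (("GOE", goe_surmise), ("GUE", gue_surmise)):
    area = integrate(p)
    mean = integrate(lambda x: x * p(x))
    print(f"{name}: area = {area:.6f}, mean spacing = {mean:.6f}")
```

The different powers of $x$ at small spacing, linear for the GOE and quadratic for the GUE, are what distinguish the two symmetry classes in data.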

The impact of these developments was not immediate in nuclear physics. At the
time, the main research endeavors were concerned with the structure of nuclei:
experiments and theories connected with Shell-, Collective-, and Unified models,
and with the nucleon-nucleon interaction. The study of quantum statistical mechanics
was far removed from the mainstream. Almost two decades went by before random
matrix theory was introduced in other fields of physics (see, for example, Bohigas,
Giannoni and Schmit [BGS] and Alhassid [Al]).


2.6 Lessons from Nuclear Physics 

We have discussed at great length the connections between nuclear physics and 
number theory, with random matrix theory describing the behavior in these two very 
different fields. Before we analyze in great detail the success it has had in modeling 
the zeros of L-functions, it’s worth taking a few moments to create a dictionary 
comparing these two subjects. 

In nuclear physics the main object of interest is the nucleus. It is a many-body
system governed by complicated forces. We are interested in studying the internal
energy levels. To do so, we shoot neutrons (which have no net charge) at the nucleus,
and observe what happens. Ideally we would be able to send neutrons of any energy
level; unfortunately in practice we can only handle neutrons whose energies are in
a certain band. The more energies at our disposal, the more refined an analysis is
possible. Finally, there is a remarkable universality from heavy nucleus to heavy
nucleus, where the distribution of spacings between adjacent energy levels depends
weakly on the quantum numbers.
Interestingly, there are analogues of all these quantities on the number theory
side. The nucleus is replaced by an L-function, which is built up as an Euler product
of many factors of arithmetic interest. We are interested in the zeros of this function.
We can glean information about them by using the explicit formula, (1.17). We
first choose an even Schwartz test function $\phi$ whose Fourier transform $\hat\phi$ has compact
support. The explicit formula relates sums of $\phi$ at the zeros of the L-function to
weighted sums of $\hat\phi$ at the primes. Thus the more functions $\phi$ where we can successfully
execute the sums over the primes, the more information we can deduce about
the zeros. Unfortunately, in practice we can only evaluate the prime sums for $\hat\phi$ with
small support (if we could do arbitrary $\hat\phi$, we could take a sequence converging to
the constant function 1, whose inverse Fourier transform would be a delta spike at
the origin and thus tell us what is happening there). Similar to the weak dependence
on the quantum numbers, the answers for many number theory statistics depend
weakly on the Satake parameters (whose moments are the Fourier coefficients in
the series expansion of the L-function). In particular, the spacing between adjacent
zeros is independent of the distribution of these parameters, though other statistics
(such as the distribution of the first zero or first few zeros above the central point)
fall into several classes depending on their distribution.

We collect these correspondences in the table below. While the structures studied 
in the two fields are very different, we can unify the presentations. In both settings 
we study the spacings between objects. While there are exact rules that govern their 
behavior, these are complicated. We gain information through interactions of test 
objects with our system; as we can only analyze these interactions in certain win¬ 
dows, we gain only partial information on the items of interest. 

Item            Nuclear Physics            Number Theory
Object          nucleus                    L-function
Events          energy levels              zeros
Probe           neutron (no net charge)    test function $\phi$ (Schwartz)
Restriction     neutron’s energy           supp($\hat\phi$)
Individuality   quantum numbers            Satake parameters


We end by extracting some lessons from nuclear physics for number theory. The
first is the importance of using the proper test function, or related to that the proper
statistic. In the gold-foil experiments (1908 to 1913) positively charged alpha particles,
which are helium nuclei, were used. Because they have a net positive charge,
they are repelled by the nucleus they are probing. With the discovery of the neutron
in 1932, physicists had a significantly better tool for studying the nucleus. As the
machinery improved, more and more neutron energy levels were available, which
led to sharper resolutions of the internal structure. We see variants of these on the
number theory side, from restrictions on the test function to the consequences of
increasing support. For example, when Wigner made his bold conjectures the data
was not sufficiently detailed to rule out Poissonian behavior; that was not done until
later when better experiments were carried out. Similar situations arise in number
theory, where some statistics are consistent with multiple models and only by increasing
the support are we able to determine the true underlying behavior. Finally,
while there is a remarkable universality in behavior of the zeros, as for statistics
such as adjacent spacings or $n$-level correlations the exact form of the L-function
coefficients does not matter, these distributions do affect the rate of convergence to
the random matrix theory predictions, as well as govern other statistics.


3 From Class Numbers to Pair Correlation and Random Matrix 
Theory 

The discovery that the pair correlation of the zeros of the Riemann zeta function
(and other statistics of its zeros, and the zeros of other L-functions) is related to
eigenvalues of random matrix ensembles has its beginnings with one of the most
challenging problems in analytic number theory: the class number problem. Hugh
Montgomery’s investigation into the vertical distribution of the nontrivial zeros of
$\zeta(s)$ arose during his work with Weinberger [MW] on the class number problem.
We give a short introduction to this problem to motivate Montgomery’s subsequent
work on the differences between zeros of $\zeta(s)$. We assume the reader is familiar
with the basics of algebraic number theory and L-functions; an excellent introduction
is Davenport’s classic Multiplicative Number Theory [Da]. For those wishing a
more detailed and technical discussion of the class number problem and its history,
see [Gol3, Gol4]. We then continue with a discussion of Montgomery’s work on
pair correlation, followed by the work of Odlyzko and others on spacings between
adjacent zeros. After introducing the number theory motivation and results, we reveal
the connection to random matrix theory, and conclude with a discussion of the
higher level correlations, other related statistics, and open problems.

As there are too many areas of current research to describe them all in detail in a 
short article, we have chosen to concentrate on two major areas: the main terms for 
the n-level correlations, and the lower order terms; thus we do not describe many
other important areas of research, such as the determination of moments or value 
distribution. The main terms are believed to be described by random matrix theory; 
however, the lower order terms depend on subtle arithmetic of the L-functions, and 
there we can see different behavior. The situation is very similar to that of the Central 
Limit Theorem, and we will describe these connections and viewpoints in greater 
detail below. 


3.1 The Class Number Problem 

Let $K = \mathbb{Q}(\sqrt{-q})$ be the imaginary quadratic field associated to the negative
fundamental discriminant $-q$. Here we have that $-q$ is congruent to 1 (mod 4) and
square-free, or $-q = 4m$, where $m$ is congruent to 2 or 3 (mod 4) and square-free.
The class number of $K$, denoted $h(-q)$, is the size of the group of ideal classes of $K$.
When $h(-q) = 1$, the ring of integers of $K$, denoted $\mathcal{O}_K$, has unique factorization.
Such an occurrence (the class number one problem, discussed below) is rare, and
the class number $h(-q)$ may be thought of as a measure of the failure of unique
factorization in $\mathcal{O}_K$.

One of the most difficult problems in analytic number theory is to estimate the
size of $h(-q)$ effectively. Gauss [Ga] showed that $h(-q)$ is finite and further conjectured
that $h$ tends to infinity as $-q$ runs over the negative fundamental discriminants.
This conjecture was proved by Heilbronn [He] in 1934. Thus, while it is settled that
there are only finitely many imaginary quadratic fields with a given class number
$h(-q)$, an obvious question remains: can we list all imaginary quadratic fields $K$
with a given class number $h(-q)$? This is the class number problem.

One may easily deduce an upper bound on $h(-q)$ via Dirichlet’s class number
formula. For $\Re(s) > 1$, let $L(s,\chi_{-q})$ denote the Dirichlet L-function

$$L(s,\chi_{-q}) := \sum_{n=1}^{\infty} \frac{\chi_{-q}(n)}{n^{s}}, \tag{3.1}$$

where $\chi_{-q}(n)$ is the Kronecker symbol associated to the fundamental discriminant
$-q$. In order to prove the equidistribution of primes in arithmetic progressions,
Dirichlet derived the class number formula,

$$h(-q) = \frac{w\sqrt{q}}{2\pi}\, L(1,\chi_{-q}), \tag{3.2}$$

where $w$ denotes the number of roots of unity of $K = \mathbb{Q}(\sqrt{-q})$:

$$w = \begin{cases} 2 & \text{if } q > 4 \\ 4 & \text{if } q = 4 \\ 6 & \text{if } q = 3. \end{cases} \tag{3.3}$$


Dirichlet needed to show $L(1,\chi_{-q}) \neq 0$, which is immediate from the class number
formula as $h(-q) \ge 1$. This connection between class numbers and zeros of
L-functions is almost 200 years old, and illustrates how knowledge of zeros of
L-functions yields information on a variety of important problems.
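
As a concrete check of the class number formula (3.2), the following sketch computes $h(-q)$ in two ways for a few primes $q \equiv 3 \pmod 4$ with $q > 3$, so that $w = 2$. Two ingredients are assumptions brought in for illustration and are not taken from the text: the classical description of ideal classes by reduced primitive binary quadratic forms, and the standard finite-sum evaluation $L(1,\chi_{-q}) = -\pi q^{-3/2}\sum_{n=1}^{q-1} n\,\chi_{-q}(n)$, under which (3.2) becomes $h(-q) = -\frac{1}{q}\sum_{n=1}^{q-1} n\,\chi_{-q}(n)$. For such prime $q$ the Kronecker symbol agrees with the Legendre symbol $(n/q)$. The function names are hypothetical.

```python
from math import gcd


def class_number_by_forms(q):
    """Count reduced primitive forms a*x^2 + b*x*y + c*y^2 with b^2 - 4ac = -q."""
    count = 0
    a = 1
    while 3 * a * a <= q:               # every reduced form satisfies 3a^2 <= q
        for b in range(-a + 1, a + 1):  # reduced forms have -a < b <= a
            if (b * b + q) % (4 * a):
                continue
            c = (b * b + q) // (4 * a)
            if c < a or (c == a and b < 0):
                continue                # fails the reduction conditions
            if gcd(gcd(a, b), c) == 1:  # keep primitive forms only
                count += 1
        a += 1
    return count


def class_number_by_dirichlet(q):
    """Evaluate (3.2) for a prime q = 3 (mod 4), q > 3, via the finite sum for L(1, chi)."""
    total = 0
    for n in range(1, q):
        legendre = 1 if pow(n, (q - 1) // 2, q) == 1 else -1  # Euler's criterion
        total += n * legendre
    return -total // q


for q in (7, 23, 47, 163):
    print(q, class_number_by_forms(q), class_number_by_dirichlet(q))
```

The two columns should agree for each $q$; the restriction to prime $q$ only keeps the character evaluation short and is not needed for (3.2) itself.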

Instead of using the class number formula to prove non-vanishing of L-functions,
we can use results on the size of L-functions to obtain bounds on the class number.
Combining (3.2) with the fact that $L(1,\chi_{-q}) \ll \log q$, it follows that
$h(-q) \ll \sqrt{q}\log q$. On the other hand, Siegel [Sie2] proved that for every $\epsilon > 0$ we have
$L(1,\chi_{-q}) > c(\epsilon)\,q^{-\epsilon}$, where $c(\epsilon)$ is a constant depending on $\epsilon$ that is not numerically
computable for small $\epsilon$. Upon inserting this lower bound in (3.2), it follows
that $h(-q) \gg q^{1/2-\epsilon}$; however, this does not help us solve the class number
problem because the implied constant is ineffective.† Computing an effective lower
bound on $h(-q)$ is a very difficult task.

The class number one problem was eventually solved independently by Heegner
[Hee], Stark [St1], and Baker [Ba1]. For $h(-q) = 2$, the class number problem was
solved independently by Stark [St2], Baker [Ba2], and Montgomery and Weinberger
[MW]. In 1976, Goldfeld [Gol1, Gol2] showed that if there exists an elliptic curve
$E$ whose Hasse-Weil L-function has a zero at the central point $s = 1$ of order at least
three, then for any $\epsilon > 0$, we have $h(-q) > C_{\epsilon,E}\,\log(|-q|)^{1-\epsilon}$, where the constant
$C_{\epsilon,E}$ is effectively computable. In other words, Goldfeld proved that if there exists an
elliptic curve whose Hasse-Weil L-function has a triple zero at $s = 1$, then the class
number problem is reduced to a finite amount of computation. In 1983, Gross and
Zagier [GZ] showed the existence of such an elliptic curve. Combining this deep
work of Gross-Zagier with a simplified version of Goldfeld’s argument to reduce
the amount of necessary computation, Oesterlé [Oe] produced a complete list of
imaginary quadratic fields with $h(-q) = 3$. To date, the class number problem is
resolved for all $1 \le h(-q) \le 100$. (In addition to the previous references, see Arno
[Ar], Arno, Robinson, and Wheeler [ARW], Wagner [Wan], and Watkins [Wa].)

Combining their work with results of Stark [St2] and Lehmer, Lehmer, and
Shanks [LLS], Montgomery and Weinberger gave a complete proof for the class
number two problem. Their proof is based on the curious Deuring-Heilbronn phenomenon,
which implies that if $h(-d) < d^{1/4-\delta}$ then the low-lying nontrivial zeros
of many quadratic Dirichlet L-functions are on the critical line, at least up to some
height depending on $d$, $\delta$, and the L-functions. For an overview of the Deuring-Heilbronn
phenomenon, see the survey article by Stopple [Sto]. Montgomery and
Weinberger also establish that if the class number is a bit smaller, then one can
show that these nontrivial zeros on the critical line are very evenly spaced. Moreover,
more precise information about the vertical distribution of these zeros would
imply an effective lower bound on $h(-d)$. Montgomery and Weinberger write:

“Let $\rho = 1/2 + i\gamma$ and $\rho' = 1/2 + i\gamma'$ be consecutive zeros on the critical line of an L-function
$L(s,\chi)$, where $\chi$ is a primitive character (mod $k$). Put

$$\lambda(K) = \min \frac{1}{2\pi}\,|\gamma - \gamma'| \log K, \tag{3.4}$$

where the minimum is over all $k \le K$, all $\chi$ (mod $k$), and all $\rho = 1/2 + i\gamma$ of $L(s,\chi)$ with
$|\gamma| \le 1$. In this range the average of $|\gamma - \gamma'|$ is $2\pi/\log K$ so trivially $\limsup \lambda(K) \le 1$.
Presumably $\lambda(K)$ tends to 0 as $K$ increases; if this could be shown effectively then the
effective lower bound $h > d^{1/2-\epsilon}$ would follow. In fact the weak inequality $\lambda(K) < 1/4 - \delta$
for $K > K_0$ implies that $h > d^{[\ldots]}$ for $d > C(K_0,\epsilon)$; the function $C(K_0,\epsilon)$ can be made
explicit. Even $\lambda(K) < \frac{1}{2} - \delta$ has striking consequences.”

† In other words, while the above is enough to prove that the class number tends to infinity, we
cannot use that argument to produce an explicit constant $Q_n$ for each $n$ so that we could assert
that the class number is at least $n$ if $q > Q_n$. One of the best illustrations of the importance of
effective constants is the following joke: There is a constant $T_0$ such that if all the non-trivial zeros
of $\zeta(s)$ in the critical strip up to height $T_0$ are on the critical line, then they all are and the Riemann
Hypothesis is true; in other words, it suffices to check up to a finite height! To see this, if the
Riemann Hypothesis is true we may take $T_0$ to be 0, while if it is false we take $T_0$ to be 1 more than
the height of the first exception. We have therefore shown a constant exists, but such information
is completely useless!


3.2 Montgomery’s Pair Correlation of the Zeros of $\zeta(s)$

We have seen that the class number problem is related to another very difficult question
in analytic number theory: What is the vertical distribution of the zeros of the
Riemann zeta function (and general L-functions) on the critical line?

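Given an increasing sequence $\{\alpha_j\}_{j=1}^{\infty}$ and a box $B \subset \mathbb{R}^{n-1}$, the n-level correlation
is defined by

$$\lim_{N\to\infty} \frac{\#\left\{(\alpha_{j_1}-\alpha_{j_2},\dots,\alpha_{j_{n-1}}-\alpha_{j_n}) \in B,\ j_i \le N,\ j_i \ne j_k\right\}}{N}. \tag{3.5}$$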

The pair correlation is the case n = 2, and through combinatorics knowing all the
correlations yields the spacing between adjacent events (see for example [Meh2]). In
1973, Montgomery [Mon] was able to partially determine the behavior for the pair
correlation of zeros of the Riemann zeta function, which led to new results on
the number of simple zeros of $\zeta(s)$ and the existence of gaps between zeros of $\zeta(s)$
that are closer together than the average. One of the most striking contributions in
Montgomery’s paper, however, is his now famous pair correlation conjecture. We
first state his conjecture and then discuss related work on spacings between adjacent
zeros in the next subsection; after these have been described in detail we then
revisit these problems and describe the connections with random matrix theory in
§3.5. See [CI] for more on connections between spacings of zeros of $\zeta(s)$ and the
class number.


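Conjecture 1 (Montgomery’s pair correlation conjecture). Assume the Riemann hypothesis,
and let $\gamma, \gamma'$ denote the imaginary parts of nontrivial zeros of $\zeta(s)$. For fixed
$0 < a < b < \infty$,

$$\lim_{T\to\infty} \frac{\#\left\{\gamma,\gamma' : 0 < \gamma,\gamma' \le T,\ 2\pi a(\log T)^{-1} \le \gamma-\gamma' \le 2\pi b(\log T)^{-1}\right\}}{\frac{T}{2\pi}\log T} = \int_a^b \left(1 - \left(\frac{\sin \pi u}{\pi u}\right)^{2}\right) du. \tag{3.6}$$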


Thus Montgomery’s pair correlation conjecture is the statement that the pair correlation
of the zeros of $\zeta(s)$ is

$$1 - \left(\frac{\sin \pi u}{\pi u}\right)^{2}. \tag{3.7}$$

Notice that the factor $1 - (\sin \pi u/\pi u)^2$ suggests a ‘repulsion’ between the zeros
of $\zeta(s)$. The notion that the zeros cannot be too close to one another was also
revealed in the aforementioned work of Montgomery and Weinberger as a consequence
of the Deuring-Heilbronn phenomenon.
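
To make the repulsion quantitative, the following minimal sketch integrates the conjectured density $1 - (\sin \pi u/\pi u)^2$ numerically over a few intervals $[a, b]$ and prints the result next to $b - a$, the value that a sequence with no repulsion (Poissonian behavior) would give. The function names and the use of Simpson's rule are illustrative choices, not part of the source; the endpoint $a = 0$ is taken as a limiting case of the conjecture's range $0 < a < b$.

```python
import math


def pair_correlation_density(u):
    """Conjectured density 1 - (sin(pi u) / (pi u))^2; it vanishes at u = 0."""
    if u == 0.0:
        return 0.0
    s = math.sin(math.pi * u) / (math.pi * u)
    return 1.0 - s * s


def predicted_pair_fraction(a, b, steps=10_000):
    """Composite Simpson approximation to the integral of the density over [a, b]."""
    if steps % 2:
        steps += 1
    h = (b - a) / steps
    total = pair_correlation_density(a) + pair_correlation_density(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * pair_correlation_density(a + k * h)
    return total * h / 3.0


for a, b in ((0.0, 0.25), (0.0, 1.0), (1.0, 2.0)):
    print(a, b, round(predicted_pair_fraction(a, b), 4), "no repulsion:", b - a)
```

For short intervals near 0 the integral is far below $b - a$, which is the numerical face of the repulsion; for intervals further out the two values become close.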

To arrive at his conjecture, Montgomery introduced the function

$$F(x,T) = \sum_{0 < \gamma,\gamma' \le T} x^{i(\gamma-\gamma')}\, w(\gamma-\gamma'), \tag{3.8}$$

where $w(u)$ is a weight function given by $w(u) = 4/(4+u^2)$. Let $F(\alpha)$ denote the
normalized version of $F(x,T)$ with $x$ set as $x = T^{\alpha}$; then

$$F(\alpha) = F(\alpha,T) = \left(\frac{T}{2\pi}\log T\right)^{-1} \sum_{0<\gamma,\gamma'\le T} T^{i\alpha(\gamma-\gamma')}\, w(\gamma-\gamma'), \tag{3.9}$$

where $\alpha$ and $T \ge 2$ are real. $F(\alpha)$ is a real, even function. Let $r(u) \in L^1$, and define
its Fourier transform by

$$\hat{r}(\alpha) = \int_{-\infty}^{\infty} r(u)\, e^{-2\pi i \alpha u}\, du. \tag{3.10}$$

The function $r$ is a test function that replaces the ‘box’ in the statement of the
pair correlation conjecture (Conjecture 1). One notable item about Montgomery’s pair
correlation conjecture is that there is no restriction on the length of the interval $[a,b]$; the
difference $b - a$ is permitted to be arbitrarily small. In the language of smooth test
functions, this translates to permitting arbitrarily large support for the Fourier transform
$\hat{r}$.

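If $\hat{r}(\alpha) \in L^1$, then upon multiplying (3.9) by $\hat{r}(\alpha)$ and integrating, we deduce

$$\sum_{0<\gamma,\gamma'\le T} r\!\left((\gamma-\gamma')\frac{\log T}{2\pi}\right) w(\gamma-\gamma') = \left(\frac{T}{2\pi}\log T\right)\int_{-\infty}^{\infty} F(\alpha)\,\hat{r}(\alpha)\,d\alpha \tag{3.11}$$

as $T$ tends to infinity. If the Riemann hypothesis is true, the relation (3.11) connects
the pair correlation of $\zeta(s)$ to the function $F(\alpha)$ given in (3.9). Montgomery
proceeded to prove an important special case of Conjecture 1 for a class of test
functions with Fourier transform supported in $(-1,1)$.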


Theorem 1 (Montgomery’s theorem). Assume the Riemann hypothesis. For real
$\alpha$, $T \ge 2$, let $F(\alpha)$ be defined by (3.9). Then $F(\alpha)$ is real, and $F(\alpha) = F(-\alpha)$. If
$T > T_0(\epsilon)$ then $F(\alpha) \ge -\epsilon$ for all $\alpha$. For fixed $\alpha$ satisfying $0 \le \alpha < 1$ we have

$$F(\alpha) = \alpha + o(1) + T^{-2\alpha}\log T\,(1+o(1)) \tag{3.12}$$

uniformly for $0 \le \alpha < 1$ as $T$ tends to infinity.

Thus, for any function $r(u) \in L^1$ with Fourier transform $\hat{r}(\alpha)$ supported in
$(-1,1)$, one can use (3.12) to evaluate the sums appearing in (3.11). For $\alpha \ge 1$,
Montgomery further conjectured, with heuristic arithmetic justification, that

$$F(\alpha) = 1 + o(1) \quad \text{uniformly in bounded intervals as } T \to \infty. \tag{3.13}$$

This conjecture, combined with (3.12), gives a complete picture of the function
$F(\alpha)$, which led Montgomery to make his pair correlation conjecture.


3.3 Proof of Montgomery’s Pair Correlation Conjecture for 
Restricted a, b 


We now provide greater detail about Montgomery’s original proof [Mon, §3, pp.
187-191] of his theorem (Theorem 1). The point of entry is an explicit formula due
to him.

The role of explicit formulae cannot be overstated when working with $\zeta(s)$ or
L-functions, as these formulae unlock the multiplicative structure implicit in the Euler
product, usually via the argument principle applied to the logarithmic derivative.
Assuming the Riemann hypothesis, and writing critical zeros of $\zeta(s)$ as $1/2 + i\gamma$
with $\gamma$ real, with $1 < \sigma < 2$ and $x \ge 1$, Montgomery proved that

$$(2\sigma-1)\sum_{\gamma}\frac{x^{i\gamma}}{(\sigma-\frac12)^2+(t-\gamma)^2} = -x^{-1/2}\left(\sum_{n\le x}\Lambda(n)\left(\frac{x}{n}\right)^{1-\sigma+it} + \sum_{n>x}\Lambda(n)\left(\frac{x}{n}\right)^{\sigma+it}\right) + x^{1/2-\sigma+it}\left(\log\tau + O_\sigma(1)\right) + O_\sigma\!\left(x^{1/2}\tau^{-1}\right), \tag{3.14}$$

where $\tau = |t| + 2$ and the implied constants depend only on $\sigma$.

Proof (Proof of Montgomery’s theorem (Theorem 1); [Mon, §3, pp. 187-191]).
Placing $\sigma = 3/2$ in (3.14), and letting $L(x,t)$ and $R(x,t)$ denote the left and right
sides, respectively, we now wish to evaluate the second moments of both sides; i.e.
$\int_0^T |L(x,t)|^2\,dt$ and $\int_0^T |R(x,t)|^2\,dt$. The reason to do this is that, as we will see, $F(\alpha)$
falls out of the second moment of the left side, and we end up with something
tractable for the second moment of the right side. Thus equating the two
moments gives us an identity for $F(\alpha)$.

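By showing the contribution of those ordinates $\gamma$ above height $T$ is $O(\log^3 T)$,
Montgomery obtained

$$\int_0^T |L(x,t)|^2\,dt = 4\sum_{\substack{0<\gamma\le T \\ 0<\gamma'\le T}} x^{i(\gamma-\gamma')}\int_0^T \frac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} + O(\log^3 T). \tag{3.15}$$

Note that the range of integration may be extended to all of $\mathbb{R}$ at a penalty no greater
in magnitude than $O(\log^3 T)$; we then have

$$\int_0^T |L(x,t)|^2\,dt = 4\sum_{\substack{0<\gamma\le T \\ 0<\gamma'\le T}} x^{i(\gamma-\gamma')}\int_{-\infty}^{\infty} \frac{dt}{(1+(t-\gamma)^2)(1+(t-\gamma')^2)} + O(\log^3 T); \tag{3.16}$$

it then follows from the residue calculus that the definite integral evaluates to
$w(\gamma-\gamma')\pi/2$ and

$$\int_0^T |L(x,t)|^2\,dt = 2\pi\sum_{\substack{0<\gamma\le T \\ 0<\gamma'\le T}} x^{i(\gamma-\gamma')}\,w(\gamma-\gamma') + O(\log^3 T). \tag{3.17}$$

Putting $x = T^{\alpha}$ yields

$$\int_0^T |L(x,t)|^2\,dt = F(\alpha)\,T\log T + O(\log^3 T). \tag{3.18}$$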
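
The residue evaluation used in passing from (3.16) to (3.17), namely that the integral over all real $t$ of $1/((1+(t-\gamma)^2)(1+(t-\gamma')^2))$ equals $w(\gamma-\gamma')\pi/2$, can be checked numerically. The sketch below is only an illustration under stated assumptions: the substitution $t = \tan\theta$ and the midpoint rule are our own choices, the function names are hypothetical, and the two sample ordinates are the first two zeta zeros rounded to six decimals.

```python
import math


def w(u):
    """Montgomery's weight w(u) = 4 / (4 + u^2)."""
    return 4.0 / (4.0 + u * u)


def kernel_integral(g1, g2, steps=200_000):
    """Midpoint rule for the integral over all real t of
    1 / ((1 + (t - g1)^2) (1 + (t - g2)^2)), after substituting t = tan(theta)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        t = math.tan(-math.pi / 2 + (k + 0.5) * h)
        # dt = (1 + t^2) d(theta) supplies the numerator
        total += (1.0 + t * t) / ((1.0 + (t - g1) ** 2) * (1.0 + (t - g2) ** 2))
    return total * h


g1, g2 = 14.134725, 21.022040  # rounded ordinates of the first two zeta zeros
print(kernel_integral(g1, g2), math.pi / 2 * w(g1 - g2))
```

The two printed numbers should agree to many digits, since the transformed integrand is smooth and periodic in $\theta$.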


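The non-negativity of the left side of (3.18) gives the statement in Theorem 1 of the
positivity of $F(\alpha)$, namely $F(\alpha) \ge -\epsilon$. (The evenness of $F(\alpha)$ follows from the fact
that $\gamma$ and $\gamma'$ may be interchanged in the definition (3.9).) It then falls to evaluate
$\int_0^T |R(x,t)|^2\,dt$. First,

$$\int_0^T \left|x^{-1+it}\log\tau\right|^2 dt = \frac{T}{x^2}\left(\log^2 T + O(\log T)\right) \tag{3.19}$$

for all $x \ge 1$, $T \ge 2$. Montgomery then applied a quantitative version of Parseval’s
identity for Dirichlet series to find

$$\int_0^T \left|\sum_n a_n n^{-it}\right|^2 dt = \sum_n |a_n|^2\,(T + O(n)). \tag{3.20}$$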

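Applying (3.20) to the explicit formula (3.14), we find

$$\begin{aligned}
\int_0^T \left|\frac{1}{x^{1/2}}\left(\sum_{n\le x}\Lambda(n)\left(\frac{x}{n}\right)^{-1/2+it} + \sum_{n>x}\Lambda(n)\left(\frac{x}{n}\right)^{3/2+it}\right)\right|^2 dt
&= \frac{1}{x}\sum_{n\le x}\Lambda(n)^2\left(\frac{x}{n}\right)^{-1}(T+O(n)) + \frac{1}{x}\sum_{n>x}\Lambda(n)^2\left(\frac{x}{n}\right)^{3}(T+O(n)) \\
&= T(\log x + O(1)) + O(x\log x),
\end{aligned} \tag{3.21}$$

where the last line follows from the prime number theorem with error term. It then
follows from simple estimation of the error terms and a more delicate application of
Cauchy-Schwarz that

$$\int_0^T |R(T^{\alpha},t)|^2\,dt = \left((1+o(1))\,T^{-2\alpha}\log T + \alpha + o(1)\right) T\log T, \tag{3.22}$$

uniformly for $0 \le \alpha \le 1 - \epsilon$. Combining (3.18) and (3.22) yields Montgomery’s
theorem. □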
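
The last step of (3.21) says that the coefficient of $T$, namely $x^{-2}\sum_{n\le x} n\Lambda(n)^2 + x^{2}\sum_{n>x}\Lambda(n)^2 n^{-3}$, is $\log x + O(1)$. The following sketch evaluates this coefficient directly for small $x$. It is an illustration only: the sieve, the function names, and the truncation of the convergent second sum at `cutoff_factor * x` are our own assumptions, and a finite computation of course does not replace the prime number theorem argument.

```python
import math


def von_mangoldt_table(limit):
    """Return a list lam with lam[n] = Lambda(n) for 0 <= n <= limit (simple sieve)."""
    lam = [0.0] * (limit + 1)
    composite = bytearray(limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            composite[m] = 1
        log_p, power = math.log(p), p
        while power <= limit:       # Lambda(p^k) = log p for every prime power
            lam[power] = log_p
            power *= p
    return lam


def diagonal_sum(x, cutoff_factor=200):
    """(1/x^2) sum_{n<=x} n Lambda(n)^2 + x^2 sum_{n>x} Lambda(n)^2 / n^3,
    with the second sum truncated at n = cutoff_factor * x."""
    limit = cutoff_factor * x
    lam = von_mangoldt_table(limit)
    head = sum(n * lam[n] ** 2 for n in range(1, x + 1)) / x**2
    tail = x**2 * sum(lam[n] ** 2 / n**3 for n in range(x + 1, limit + 1))
    return head + tail


for x in (100, 1000):
    print(x, diagonal_sum(x), math.log(x))
```

The printed pairs should differ by a bounded amount, in line with the $\log x + O(1)$ main term.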


We end this section by describing the heuristic evidence that led Montgomery
to conjecture (3.13) on the behavior of $F(\alpha)$ for $\alpha \ge 1$. The argument above for
proving Montgomery’s conjecture for $0 \le \alpha < 1$ fails for $\alpha \ge 1$, since error terms
such as in (3.21) and those arising from Cauchy-Schwarz and the last line of (3.14)
are no longer dominated by the main term.

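Examining the sum over primes from the explicit formula (3.14) with $\sigma = 3/2$,

$$\sum_{n\le x}\Lambda(n)\left(\frac{x}{n}\right)^{-1/2+it} + \sum_{n>x}\Lambda(n)\left(\frac{x}{n}\right)^{3/2+it}, \tag{3.23}$$

the expected value is seen by the prime number theorem to be

$$\frac{2x}{\left(\frac12+it\right)\left(\frac32-it\right)}. \tag{3.24}$$

From the proof of Montgomery’s theorem we have, with $F(x,T)$ as in (3.8), that

$$F(x,T) = \frac{1}{2\pi x}\int_0^T \left|\sum_{n\le x}\Lambda(n)\left(\frac{x}{n}\right)^{-1/2+it} + \sum_{n>x}\Lambda(n)\left(\frac{x}{n}\right)^{3/2+it} - \frac{2x}{\left(\frac12+it\right)\left(\frac32-it\right)}\right|^2 dt + o(T\log T); \tag{3.25}$$

it follows that we would like to know the size of

$$\int_0^T \left|\frac{1}{x}\sum_{n\le x}\Lambda(n)\,n^{1/2-it} + x\sum_{n>x}\Lambda(n)\,n^{-3/2-it} - \frac{2x^{1/2-it}}{\left(\frac12+it\right)\left(\frac32-it\right)}\right|^2 dt. \tag{3.26}$$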


Montgomery proceeded to multiply out and integrate term-by-term, finding that the
off-diagonal contribution is not negligible. He collected terms in the form of sums of the sort

$$\sum_{n\le y}\Lambda(n)\Lambda(n+h); \tag{3.27}$$

invoking the Hardy-Littlewood k-tuple conjecture for 2-tuples with a strong error
term, (3.27) should be of order $y$. This would give

$$F(x,T) \sim \frac{T}{2\pi}\log T \tag{3.28}$$

in $T \le x \le T^{2-\epsilon}$, and there is little reason to expect the behavior to change for
bounded $\alpha \ge 2$. On this basis, Montgomery made his conjecture (3.13).


3.4 Spacings Between Adjacent Zeros

Motivated by Montgomery’s pair correlation conjecture on the zeros of the Riemann
zeta function, starting in the late 1970s Andrew Odlyzko began a large-scale computation
of zeros of $\zeta(s)$ high in the critical strip. The average spacing between zeros
of $\zeta(s)$ at height $T$ in the critical strip is on the order of $1/\log T$; thus as we go
higher and higher we have more and more zeros in regions of fixed size, and there
is every reason to hope that, after an appropriate normalization, a limiting behavior
exists.

The story of computing zeta zeros goes back to Riemann himself. As mentioned
in §1.2.2, in his one paper on the zeta function [Ri], Riemann states the Riemann
hypothesis (RH) in passing. He used a formula now known as the Riemann-Siegel
formula to compute a few zeros of $\zeta(s)$ up to a height of probably no greater than
100 in the critical strip. Though he did not mention these computations in the paper,
they played an important role in the development of mathematics,
mirroring the role played by the calculation of energy levels in nuclear physics
in illuminating the internal structure of the nucleus. The formula was actually lost
for almost 70 years, and did not enter the mathematics literature until Siegel was
reading Riemann’s works [Sie1]. Siegel’s role in understanding, collecting, and interpreting
Riemann’s notes should not be underestimated, since the expertise and
insight needed to infer the ideas behind the notes was great.

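The development of the Riemann-Siegel formula proceeds along the purely classical
lines of complex analysis. Riemann had a formula for $\zeta(s)$ valid for all $s \in \mathbb{C}$,
namely

$$\zeta(s) = \frac{\Gamma(1-s)}{2\pi i}\int_{\mathcal{C}} \frac{(-x)^{s}}{e^{x}-1}\,\frac{dx}{x}, \tag{3.29}$$

where $\mathcal{C}$ is the contour that starts at $+\infty$, traverses the real axis towards the origin,
circles the origin once with the positive orientation, and then retraces its
path along the real axis to $+\infty$.

By splitting off some finite sums from the contour integral above, Riemann arrived
at the formula

$$\zeta(s) = \sum_{n=1}^{N} n^{-s} + \Gamma(1-s)(2\pi)^{s-1}\,2\sin\!\left(\frac{\pi s}{2}\right)\sum_{n=1}^{M} n^{-(1-s)} + \frac{\Gamma(1-s)}{2\pi i}\int_{\mathcal{C}_M} \frac{(-x)^{s}e^{-Nx}}{e^{x}-1}\,\frac{dx}{x}, \tag{3.30}$$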


where $s \in \mathbb{C}$ and $N, M \in \mathbb{N}$ are arbitrary, and $\mathcal{C}_M$ is the contour that traces from $+\infty$
to $(2M+1)\pi$, traverses the circle of radius $(2M+1)\pi$ about the origin once with positive
orientation, and then returns to $+\infty$, thereby enclosing the poles $\pm 2\pi i M, \pm 2\pi i (M-1), \dots, \pm 2\pi i$,
and the singularity at 0. This formula for $\zeta(s)$ can be regarded as an approximate
functional equation, where the remainder is expressed explicitly in terms of the contour
integral over $\mathcal{C}_M$. The main task in developing the Riemann-Siegel formula then
falls to estimating the contour integral over $\mathcal{C}_M$ using the saddle-point method.

Prior to Siegel’s work, in 1903 Gram showed that the first 10 zeros of $\zeta(s)$ lie on
the critical line, and showed that these 10 were the only zeros up to height 50. The
development of the above, along with a cogent narrative of Riemann, Siegel, and
Gram’s contributions, may be found in Edwards [Ed].

In almost every decade in the last century, mathematicians have set new records
for computations of critical zeros of $\zeta(s)$. Alan Turing brought the computer to bear
on the problem of computing zeta zeros for the first time in 1950, when, as recounted
by Hejhal and Odlyzko [HO], Turing used the Manchester Mark 1 Electronic Computer,
which had 25,600 bits of memory and punched its output on teleprint tape in
base 32, to verify every zero up to height 1540 in the critical strip (he found there are
1104 such zeros). Turing also introduced a simplified algorithm to compute zeta zeros,
now known as Turing’s method. Turing published on his computer computations
and his new algorithm for the first time in 1953 [Tur].

Following Turing, the computation of zeros of $\zeta(s)$ took off thanks to the increasing
power of the computer. At this time, the first $10^{13}$ nontrivial zeros of $\zeta(s)$, tens
of billions of nontrivial zeros around zeros number $10^{23}$ and $10^{24}$, and hundreds of
nontrivial zeros near zero number $10^{32}$ are known to lie on the critical line. Additionally,
new algorithms by Schönhage and Odlyzko, and by Schönhage, Heath-Brown, and
Hiary have sped up the verification of zeta zeros.

However, the aforementioned projects for numerically checking that zeros of
$\zeta(s)$ lie on the critical line were not concerned with accurately recording the height
along the critical line of the zeros computed, only with ensuring the zeros had real
part exactly 1/2. This changed in the late 1970s with a series of computations by
Andrew Odlyzko, who was motivated not only by the Riemann Hypothesis but also
by Montgomery’s pair correlation conjecture.

Rather than verify consecutive zeros starting from the critical point, Odlyzko was
interested in starting his search high up in the critical strip, in the hope that near zero
number 10^^, the behavior of $\zeta(s)$ would be closer to its asymptotic behavior. For,
as Montgomery’s pair correlation conjecture is a statement about the limit as one’s
height in the critical strip passes to infinity, one would wish to know the ordinates of
many consecutive zeta zeros in the regime where $\zeta(s)$ is behaving asymptotically if
one wished to test the plausibility of the conjecture.

As he explains in [Od2], his first computations [Od1] were in a window around
zero number $10^{12}$, and were done on a Cray supercomputer using the Riemann-Siegel
formula. These computations motivated Odlyzko and Arnold Schönhage to
develop a faster algorithm for computing zeros [Od3, OS], which was implemented
in the late 1980s and was subsequently used to compute several hundred million
zeros near zero number $10^{20}$ and some near number $2 \cdot 10^{20}$, as seen in [Od4, Od5].


3.5 Number Theory and Random Matrix Theory Successes 

After its introduction as a conjecture in the late 1950s to describe the energy levels
of heavy nuclei, random matrix theory experienced successes on both the numerical
and the experimental fronts. The theory was beautifully developed to handle
a large number of statistics, and many of these predictions were supported as
more and more data on heavy nuclei became available. While there was significant
theoretical progress (see, among others, [Dy1, Dy2, Gau, Meh1, MG, Wig1,
Wig2, Wig3, Wig4, Wig5, Wig6]), there were some gaps that were not resolved
until recently. For example, while the density of normalized eigenvalues in matrix
ensembles (Wigner’s semi-circle law) was known for all ensembles where the entries
were chosen independently from nice distributions, the spacings between adjacent
normalized eigenvalues resisted proof until this century (see, among others,
[ERSY, ESY, TV1, TV2]).

The fact that random matrix theory also had a role to play in number theory
only emerged roughly twenty years after Wigner’s pioneering investigations. The
cause of the connection was a chance encounter between Hugh Montgomery and
Freeman Dyson at the Institute for Advanced Study at Princeton. As there are now
many excellent summaries and readable surveys of their meeting, early years and
statistics (see in particular [Ha] for a Hollywoodized version), and the story is now
well known, we content ourselves with a quick summary. For more, see among

others llConTllCon2llDiT1lDilllKllKaSalllKaS^IKeSn31lMTOl . 

As described in §3.1, Montgomery was interested in the class number, which led
him to study the pair correlation of zeros of the Riemann zeta function. Given an
increasing sequence $\{\alpha_j\}_{j=1}^{\infty}$ and a box $B \subset \mathbb{R}^{n-1}$, the n-level correlation is defined by

$$\lim_{N\to\infty} \frac{\#\left\{(\alpha_{j_1}-\alpha_{j_2},\dots,\alpha_{j_{n-1}}-\alpha_{j_n}) \in B,\ j_i \le N,\ j_i \ne j_k\right\}}{N}; \tag{3.31}$$

the pair correlation is the case n = 2, and through combinatorics knowing all the
correlations yields the spacing between adjacent events. Montgomery was partially
able to determine the behavior for the pair correlation. When he told Dyson his
result, Dyson recognized it as the pair correlation function of eigenvalues of random
Hermitian matrices in a Gaussian Unitary Ensemble, GUE.

This observation was the beginning of a long and fruitful relationship between
the two areas. At first it appeared that the GUE was the only family of matrices
needed for number theory, as there was remarkable universality seen in statistics.
This ranged from work by Dennis Hejhal [Hej] on the 3-level correlation of the
zeros of $\zeta(s)$ and Zeev Rudnick and Peter Sarnak [RS] on the n-level correlation of
general automorphic L-functions, to Odlyzko’s [Od1, Od2] striking experiments on
spacings between adjacent normalized zeros. In all cases the behavior agreed with
that of the GUE.

In particular, Odlyzko’s computations of high zeta zeros showed that, high
enough along the critical line, the empirical distribution of nearest-neighbor spacings
for zeros of $\zeta(s)$ becomes more or less indistinguishable from that of eigenvalues
of random matrices from the Gaussian Unitary Ensemble, or GUE. The agreement
with the first million zeros is poor, but the agreement near zero number $10^{12}$ is
close, near perfect near zero number $10^{16}$, and even better near zero number $10^{20}$.
These results provide massive evidence for Montgomery’s conjecture, and vindicate
Odlyzko’s choice of starting his search high along the critical line; see Figure 5.


[Figure: “Nearest neighbor spacings”; horizontal axis: normalized spacing, from 0.0 to 3.0.]

Fig. 5 Probability density of the normalized spacings $\delta_n$. Solid line: GUE prediction. Scatterplot:
empirical data based on Odlyzko’s computation of a billion zeros near zero #$1.3 \times 10^{16}$. (From
Odlyzko [Od2, Figure 1, p. 4].)
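
For readers who want a feel for the shape of the GUE curve, here is a minimal simulation sketch. It is not Odlyzko's computation and not the exact limiting GUE spacing law: it samples $2 \times 2$ GUE matrices, whose eigenvalue gap (rescaled to mean 1) follows the Wigner surmise $(32/\pi^2)\,s^2 e^{-4s^2/\pi}$, a well-known close approximation to the large-matrix spacing density. The normalization of the matrix entries, the function names, and the bin width are our own assumptions.

```python
import math
import random


def gue_2x2_spacings(samples, seed=0):
    """Eigenvalue gaps of 2x2 GUE matrices, rescaled to have sample mean 1."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(samples):
        a, d = rng.gauss(0, 1), rng.gauss(0, 1)  # real diagonal entries
        re = rng.gauss(0, 1) / math.sqrt(2)      # off-diagonal entry b = re + i*im
        im = rng.gauss(0, 1) / math.sqrt(2)
        # eigenvalues of [[a, b], [conj(b), d]] differ by sqrt((a-d)^2 + 4|b|^2)
        gaps.append(math.sqrt((a - d) ** 2 + 4 * (re * re + im * im)))
    mean = sum(gaps) / len(gaps)
    return [g / mean for g in gaps]


def wigner_surmise_gue(s):
    """Wigner surmise for the unitary class: (32/pi^2) s^2 exp(-4 s^2 / pi)."""
    return 32 / math.pi**2 * s * s * math.exp(-4 * s * s / math.pi)


spacings = gue_2x2_spacings(100_000)
width = 0.25
for k in range(12):
    lo, hi = k * width, (k + 1) * width
    empirical = sum(lo <= s < hi for s in spacings) / (len(spacings) * width)
    print(f"{lo:4.2f}-{hi:4.2f}  empirical {empirical:.3f}  "
          f"surmise {wigner_surmise_gue((lo + hi) / 2):.3f}")
```

The very small density near spacing 0 is the same repulsion visible in the figure; a Poisson process would instead have density close to 1 there.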


In all of these investigations, however, the statistics studied are insensitive to the
behavior of finitely many zeros. This is a problem, as certain zeros of L-functions
play an important role. The most important of these are those of elliptic curve
L-functions. Numerical computations on the number of points on elliptic curves modulo
$p$ led to the Birch and Swinnerton-Dyer conjecture. Briefly, this states that the
order of vanishing of the L-function at the central point equals the geometric rank of
the Mordell-Weil group of rational solutions. The theorems on n-level correlations
and spacings between adjacent zeros are all limiting statements; we may remove
finitely many zeros without changing these limits. Thus these quantities cannot detect
what is happening at the central point.

Unfortunately for those who were hoping to distinguish between different symmetry
groups, Nick Katz and Peter Sarnak [KaSa1, KaSa2] showed in the nineties
that the n-level correlations of the scaling limits of the classical compact groups are
all the same and equal that of the GUE. Thus when we were saying number theory
agreed with GUE we could instead have said it agreed with unitary, symplectic or
orthogonal matrices.

This led them to develop a new statistic that would be sensitive to finitely many
zeros in general, and the important ones near the central point in particular. The
resulting quantity is the n-level density. We assume the Generalized Riemann Hypothesis
(GRH) for ease of exposition, so given an $L(s,f)$ all the zeros are of the
form $1/2 + i\gamma_{j;f}$ with $\gamma_{j;f}$ real. The statistics are still well-defined if GRH fails, but
we lose the interpretation of ordered zeros and connections with nuclear physics.
For more detail on these statistics see the seminal work by Henryk Iwaniec, Wenzhi
Luo and Peter Sarnak [ILS], who introduced them (or [AAILMZ] for an expanded
discussion).

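Let $\phi_j$ be even Schwartz functions such that the Fourier transforms

$$\widehat{\phi}_j(y) := \int_{-\infty}^{\infty} \phi_j(x)\,e^{-2\pi i x y}\,dx \tag{3.32}$$

are compactly supported, and set $\phi(x) = \prod_{j=1}^{n} \phi_j(x_j)$. The n-level density for $f$ with
test function $\phi$ is

$$D_n(f,\phi) = \sum_{\substack{j_1,\dots,j_n \\ j_\ell \ne \pm j_m}} \phi_1\!\left(L_f\gamma_{j_1;f}\right)\cdots\phi_n\!\left(L_f\gamma_{j_n;f}\right), \tag{3.33}$$

where $L_f$ is a scaling parameter which is frequently related to the conductor. Given
a family $\mathcal{F} = \cup_N \mathcal{F}_N$ of L-functions with conductors tending to infinity, the n-level
density $D_n(\mathcal{F},\phi,w)$ with test function $\phi$ and non-negative weight function $w$ is
defined by

$$D_n(\mathcal{F},\phi,w) := \lim_{N\to\infty} \frac{\sum_{f\in\mathcal{F}_N} w(f)\,D_n(f,\phi)}{\sum_{f\in\mathcal{F}_N} w(f)}. \tag{3.34}$$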


Katz and Sarnak [KaSa1, KaSa2] conjecture that as the conductors tend to infinity,
the n-level density of zeros near the central point in families of L-functions
agrees with the scaling limits of eigenvalues near 1 of classical compact groups.
Determining which classical compact group governs the symmetry is one of the
hardest problems in the subject, though in many cases through analogies with a
function field analogue one has a natural candidate for the answer, arising from the
monodromy group. Unlike the n-level correlations, the different classical compact
groups all have different scaling limits. As the test functions are Schwartz and of
rapid decay, this statistic is sensitive to the zeros at the central point. While it was
possible to look at just one L-function when studying correlations, that is not the
case for the n-level density. The reason is that while one L-function has infinitely
many zeros, it only has a finite number within a small, bounded window of the central
point (the size of the window is a function of the analytic conductor). We always
need to perform some averaging; for the n-level correlations each L-function gives
us enough zeros high up on the critical line for such averaging, while for the n-level
density we must move horizontally and look at a family of L-functions. While the
exact definition of family is still a work in progress, roughly it is a collection of
L-functions coming from a common process. Examples include Dirichlet characters,
elliptic curves, cuspidal newforms, symmetric powers of GL(2) L-functions,
Maass forms on GL(3), and certain families of GL(4) and GL(6) L-functions; see
for example [AAILMZ, AM, DM1, DM2, ER-GR, FiMi, Gü, ILS, KaSa2, LM, MilPe, OS1, OS2,
Ro, Rub]. This correspondence between zeros and eigenvalues allows us, at least
conjecturally, to assign a definite symmetry type to each family of L-functions
(see [DM2, ShTe] for more on identifying the symmetry type of a family).

There are many other quantities that can be studied in families. Instead of looking
at zeros, one could look at values of L-functions at the central point, or moments
along the critical line. There is an extensive literature here of conjectures and
results, again with phenomenal agreement between the two areas. See for example
[CFKRS].


4 Future Trends and Questions in Number Theory 

The results above are just a small window into the great work that has been done
with number theory and random matrix theory. Our goal above was not to write a
treatise, but to quickly review the history and some of the main results, setting the
stage for some of the problems we think will drive progress in the coming decades.
As even that covers too large an area, we have chosen to focus on a few problems
with a strong numeric component, where computational number theory is providing
the same support and drive to the subject as experimental physics did years
before. There are of course many other competing models for L-functions. One is
the Ratios Conjectures of Conrey, Farmer and Zirnbauer [CFZ1, CFZ2, CS]. Another
excellent candidate is Gonek, Hughes and Keating's hybrid model [GHK],
which combines random matrix theory with arithmetic by modeling the L-function
as a partial Hadamard product over the zeros, which is modeled by random matrix
theory, and a partial Euler product, which contains the arithmetic.

In all of the quantities studied, we have agreement (either theoretical or
experimental) of the main terms with the main terms of random matrix theory in an
appropriate limit. A natural question to ask is how this agreement is reached; in other
words, what is the rate of convergence, and what affects this rate? In the interest of
space we assume in parts of this section that the reader is familiar with the results
and background material from [ILS, RS], though we describe the results in general
enough form to be accessible to a wide audience.


4.1 Nearest Neighbor Spacings 

We first look at spacings between adjacent zeros, where Odlyzko's work has shown
phenomenal agreement for zeros of $\zeta(s)$ and eigenvalues of the GUE ensemble.
We plot the difference between the empirical and 'theoretical,' or 'expected,' GUE
spacings in Figure 6. In his paper [Od2], Odlyzko writes: "Clearly there is structure
in this difference graph, and the challenge is to understand where it comes from."
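
The kind of "empirical minus expected" comparison described above can be sketched on toy data. The code below is not Odlyzko's computation: as loudly stated assumptions, it replaces the zeros of the zeta function by eigenvalue gaps of simulated 2 x 2 GUE matrices, and it replaces the exact GUE spacing distribution by the Wigner surmise (32/π^2) s^2 exp(-4 s^2/π), which is exact for 2 x 2 matrices but only an approximation in general. All function names, the bin width and the sample size are illustrative.

```python
import math
import random

def wigner_surmise_gue(s):
    """Wigner surmise for GUE spacings, normalized to mean spacing 1."""
    return (32.0 / math.pi ** 2) * s * s * math.exp(-4.0 * s * s / math.pi)

def sample_2x2_gue_spacings(count, seed=0):
    """Eigenvalue gaps of random 2x2 Hermitian matrices [[a, b+ic], [b-ic, d]]."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(count):
        a = rng.gauss(0.0, 1.0)
        d = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, math.sqrt(0.5))
        c = rng.gauss(0.0, math.sqrt(0.5))
        # Gap between the two eigenvalues of the Hermitian matrix.
        gaps.append(math.sqrt((a - d) ** 2 + 4.0 * (b * b + c * c)))
    mean_gap = sum(gaps) / len(gaps)
    return [g / mean_gap for g in gaps]  # rescale to mean spacing 1

def difference_histogram(spacings, bin_width=0.1, max_s=3.0):
    """Return (bin center, empirical density - expected density) pairs."""
    bins = int(round(max_s / bin_width))
    counts = [0] * bins
    for s in spacings:
        k = int(s / bin_width)
        if k < bins:
            counts[k] += 1
    n = len(spacings)
    rows = []
    for k, c in enumerate(counts):
        center = (k + 0.5) * bin_width
        rows.append((center, c / (n * bin_width) - wigner_surmise_gue(center)))
    return rows

if __name__ == "__main__":
    for center, diff in difference_histogram(sample_2x2_gue_spacings(200000)):
        print(f"{center:4.2f} {diff:+.4f}")
```

In this toy setting the expected density is exact, so the printed differences are sampling and binning noise. In the setting of Figure 6 the interest lies precisely in the part of the difference that is not noise.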

Recently, compelling work of Bogomolny, Bohigas, Leboeuf and Monastra
[BBLM] provides a conjectural answer for the source of the additional structure
in the form of lower-order terms in the pair correlation function for $\zeta(s)$. Though
[Figure 6 plot: "Nearest neighbor spacings"; horizontal axis: normalized spacing.]

Fig. 6 Probability density of the normalized spacings $\delta_n$. Difference between the
empirical distribution for a billion zeros near zero #$1.3 \times 10^{16}$, as computed by
Odlyzko, and the GUE prediction. (From Odlyzko [Od2, Figure 2, p. 5].)


the main term is all that appears in the limit (where Montgomery's conjecture
applies), the lower-order terms contribute to any computation outside the limit, and
would therefore influence any numerical computations like those of Odlyzko. By
comparing a conjectural formula for the two-point correlation function of critical
zeros of $\zeta(s)$ of roughly height T due to Bogomolny and Keating in [BK] with
the known formula for the two-point correlation function for eigenvalues of unitary
matrices of size N, Bogomolny et al. deduce a recipe for picking a matrix size
that will best model the lower-order terms in the two-point correlation function, and
conjecture that it will be the best choice for all correlation functions, and therefore
the nearest-neighbor spacing. More recently yet, Dueñez, Huynh, Keating, Miller,
and Snaith [DHKMS1, DHKMS2] have applied techniques of Bogomolny et al.
and others to studying lower-order terms in the behavior of the lowest zeros of
L-functions attached to elliptic curves. Their results are currently being extended to
other L-functions by the first and third named authors here and their colleagues.
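
To illustrate how a finite matrix size produces lower-order terms, the sketch below compares the two-point correlation function for N x N unitary matrices with its large-N limit. The formulas are not written out in this chapter and are taken here as an assumption in their standard form, with mean spacing scaled to 1: 1 - (sin(πx) / (N sin(πx/N)))^2 for size N, and 1 - (sin(πx)/(πx))^2 in the limit. The code does not implement the recipe of Bogomolny et al. for choosing N; it only shows that the deviation from the limit is of size 1/N^2, with leading coefficient -sin^2(πx)/3 obtained by Taylor expansion. Function names are illustrative.

```python
import math

def pair_correlation_limit(x):
    """Limiting two-point function 1 - (sin(pi x) / (pi x))^2."""
    if x == 0.0:
        return 0.0
    t = math.pi * x
    return 1.0 - (math.sin(t) / t) ** 2

def pair_correlation_unitary(x, n):
    """Two-point function for n x n unitary matrices (valid for 0 <= |x| < n)."""
    if x == 0.0:
        return 0.0
    t = math.pi * x
    return 1.0 - (math.sin(t) / (n * math.sin(t / n))) ** 2

def scaled_correction(x, n):
    """n^2 times the finite-size deviation from the limiting two-point function."""
    return n * n * (pair_correlation_unitary(x, n) - pair_correlation_limit(x))

if __name__ == "__main__":
    x = 0.5
    leading = -math.sin(math.pi * x) ** 2 / 3.0
    for n in (10, 100, 1000):
        print(n, scaled_correction(x, n), leading)
```

The main term is the same for every N, while the matrix size enters only through the correction. This is the feature that makes it meaningful to tune N against the lower-order terms of a conjectural formula for the zeros.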


4.2 n-Level Correlations and Densities 

The results of the studies on spacings between zeros suggest that, while the arithmetic
of the L-function is not seen in the main term, it does arise in the lower order terms,
which determine the rate of convergence to the random matrix theory predictions.
Another great situation where this can be seen is through the n-level correlations
and the work of Rudnick and Sarnak [RS]. They proved that the n-level correlations
of all cuspidal automorphic L-functions $L(s, \pi)$ have the same limit (at least in
suitably restricted regions). Briefly, the source of the universality in the main term
comes from the Satake parameters in the Euler product of the L-function, whose
moments are the coefficients in the series expansion. In their Remark 3 they write
(all references in the quote are to their paper):

The universality (in $\pi$) of the distribution of zeros of $L(s, \pi)$ is somewhat surprising,
the reason being that the distribution of the coefficients $a_\pi(p)$ in (1.6), as p runs over
primes, is not universal. For example, for degree-two primitive L-functions, there are two
conjectured possible limiting distributions for the $a_\pi(p)$'s: Sato-Tate or uniform
distribution (with a Dirac mass term). As the degree increases, the number of possible limit
distributions increases rapidly. However, it is a consequence of the theory of the
Rankin-Selberg L-functions (developed by Jacquet, Piatetski-Shapiro, and Shalika for
$m \ge 3$) that all these limiting distributions have the same second moment (at least under
hypothesis (1.7)). It is the universality of the second moment that is eventually responsible
for the universality in Theorems 1.1 and 1.2. For the case of pair correlation (n = 2), this is
reasonably evident; for n > 2 it was (at least for us) unexpected, and it has its roots in a
key feature of "diagonal pairings" that emerges as the main term in the asymptotics of
$R_n(T, f, h)$.
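
The equality of second moments mentioned in the quoted remark can be checked numerically for the two degree-two distributions it names. The sketch below rests on two assumptions that are standard but not spelled out in this chapter: the Sato-Tate case is modeled by a = 2 cos(θ) with θ distributed as (2/π) sin^2(θ) dθ on [0, π], and the uniform case with a Dirac mass by a = 0 with probability 1/2 and otherwise a = 2 cos(θ) with θ uniform on [0, π]. The function names and the midpoint integration rule are illustrative.

```python
import math

def midpoint_integral(g, lo, hi, steps=20000):
    """Midpoint rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

def sato_tate_moment(k):
    """k-th moment of a = 2 cos(theta) under (2/pi) sin^2(theta) d theta."""
    return midpoint_integral(
        lambda th: (2.0 * math.cos(th)) ** k * (2.0 / math.pi) * math.sin(th) ** 2,
        0.0, math.pi)

def dirac_uniform_moment(k):
    """k-th moment (k >= 1): a = 0 with probability 1/2, else theta is uniform."""
    uniform_part = midpoint_integral(
        lambda th: (2.0 * math.cos(th)) ** k / math.pi, 0.0, math.pi)
    return 0.5 * uniform_part

if __name__ == "__main__":
    for k in (2, 4):
        print(k, sato_tate_moment(k), dirac_uniform_moment(k))
```

Under these assumptions both second moments equal 1, while the fourth moments are 2 and 3 respectively. The two distributions therefore agree exactly at the moment that governs the main term and differ at the higher moments.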

Similar results are seen in the n-level densities. There we average the Satake
parameters over a family of L-functions, and in the limit as the conductors tend to
infinity only the first and second moments contribute to the main term (at least under
the assumption of the Ramanujan conjectures for the sizes of these parameters). The first
moment controls the rank at the central point, and the second moment determines
the symmetry type (see [DM2, ShTe]). For example, families of elliptic curves with
very different arithmetic (complex multiplication or not, or different torsion
structures) have the same limiting behavior but have different rates of convergence to that
limiting behavior. This can be seen in terms of size one over the logarithm of the
conductor; while these terms vanish as the conductors tend to infinity, they are present
for finite values. See [Mil2, Mil4] for several examples (as well as [MMRW], where
interesting biases are observed in lower order terms of the second moments in the
families).


4.3 Conclusion 

The number theory results above may be interpreted in a framework similar to that
of the Central Limit Theorem. There, if we have 'nice' independent identically
distributed random variables, their normalized sum (standardized to have mean zero
and variance 1) converges to the standard normal distribution. The remarkable fact
is the universality: the limiting distribution is independent of the shape of
the underlying distribution. We quickly review why this is the case and then interpret
our number theory results in a similar vein.

Given a distribution with finite mean and variance, we can always perform a 
linear change of variables to study a related quantity where now the mean is zero 
and the variance one. Thus, the first moment where the shape of the distribution is 
noticeable is the third moment (or the fourth if the distribution is symmetric about 
the mean). In the proof of the Central Limit Theorem through moment generating
functions or characteristic functions, the third and higher moments do not survive in 
the limit. Thus their effect is only on the rate of convergence to the limiting behavior 
(see the Berry-Esseen theorem), and not on the convergence itself. 
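
The statement that higher moments affect only the rate of convergence can be seen in a small exact computation. As an illustrative choice not taken from the text, the sketch below uses sums of n independent Bernoulli(p) variables, whose distribution is Binomial(n, p), and computes the largest gap between the cumulative distribution function of the standardized sum and that of the standard normal. The value p = 0.5 gives a symmetric summand with vanishing third central moment, while p = 0.1 gives a skewed one. Function names are hypothetical.

```python
import math

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def distance_to_normal(n, p):
    """Sup-distance between the standardized Binomial(n, p) CDF and the normal CDF."""
    mean = n * p
    sd = math.sqrt(n * p * (1.0 - p))
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        phi = normal_cdf((k - mean) / sd)
        worst = max(worst, abs(cdf - phi))   # just below the jump at k
        cdf += math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        worst = max(worst, abs(cdf - phi))   # at the jump itself
    return worst

if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, distance_to_normal(n, 0.5), distance_to_normal(n, 0.1))
```

Both columns shrink as n grows, so the limit is the same, but for each n the skewed case sits farther from the normal distribution than the symmetric one. The shape of the summand is visible only in how quickly the limit is approached.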

The situation is similar in number theory. The higher moments of the Satake 
parameters (which control the coefficients of the L-functions) again surface only in 
terms which vanish in the limit, and their effect therefore is seen only in the rate of 
convergence. 

This suggests several natural questions. We conclude with two below, which we 
feel will play a key role in studies in the years to come. These two questions provide 
a nice mix, with the first related to the main term and the second related to the rate 
of convergence. 


• Is Montgomery's pair correlation conjecture true for all boxes (or test functions)?
What about the n-level correlations, both for $\zeta(s)$ and cuspidal automorphic
L-functions? Note agreement with random matrix theory for all these statistics implies
the conjectures on spacings between adjacent zeros.

• For a given L-function (if we are studying n-level correlations) or a family of
L-functions (if we are studying n-level densities), how does the arithmetic enter? 
Specifically, what are the possible lower order terms? How are these affected by 
properties of the L-functions? If we use Rankin-Selberg convolution to create 
new L-functions, how is the arithmetic of the lower order terms here a function 
of the arithmetic of the constituent pieces? 


There are numerous resources and references for those wishing to pursue these
questions further. For the n-level correlations, the starting points are the papers
[Mon, Hej, RS], while for the n-level densities they are [KaSa1, KaSa2, ILS].